FOREWORD


This Indian Standard (Part 1) (Third Revision) was adopted by the Bureau of Indian Standards, after the draft 
finalized by the Statistical Methods for Quality and Reliability Sectional Committee had been approved by the 
Management and Systems Division Council. 


This standard was first published in 1976 and revised in 1985 and 1994. This revision aims to bring it in
line with ISO 3534-1 : 2006 ‘Statistics — Vocabulary and symbols — Part 1: General statistical terms and terms 
used in probability’ issued by the International Organization for Standardization (ISO). However, with a view to
making the various definitions easily understandable and comprehensible, based on comments received, certain
modifications were made to ISO 3534-1. This standard is technically equivalent to ISO 3534-1 : 2006 where 
certain text has been added to the various terms but no text has as such been deleted from the definitions. 


Annex A gives a list of symbols and abbreviations recommended to be used for this standard. The entries in this 
standard are arranged in association with concept diagrams provided as Annexes B and C. Concept diagrams are 
provided in an informative Annex for each group of terms: (a) general statistical terms (see Annex B), and (b) 
terms used in probability (see Annex C). There are six concept diagrams for general statistical terms and four 
concept diagrams for terms related to probability. Some terms appear in multiple diagrams to provide a link from 
one set of concepts to another. Annex D provides a brief introduction to concept diagrams and their interpretation. 


These diagrams were instrumental in constructing this revision as they assist in delineating the interrelationships 
of the various terms. Further, in this standard the definitions generally relate to the one-dimensional (univariate) 
case and therefore, the one-dimensional scope for most of the definitions is not mentioned repetitively. 


During the formulation of this standard, considerable assistance has been taken from the following International 
Standards: 


ISO 31-11: 1992 Quantities and units — Part 11: Mathematical signs and symbols for use in the physical 
sciences and technology 


ISO 3534-2 : 2006 Statistics — Vocabulary and symbols — Part 2: Applied statistics 


VIM : 1993 International vocabulary of basic and general terms in metrology, BIPM, IEC, IFCC, 
ISO, OIML, IUPAC, IUPAP 
Indian Standard


STATISTICS — VOCABULARY AND SYMBOLS 
PART 1 GENERAL STATISTICAL TERMS AND TERMS USED IN PROBABILITY 


( Third Revision ) 


1 SCOPE 


This standard (Part 1) defines general statistical terms 
and terms used in probability. In addition, it defines 
symbols for a limited number of these terms. 


2 REFERENCES 


The following standards are necessary adjuncts to this
standard:


IS No.                                   Title

15393                                    Accuracy (trueness and precision) of measurement
                                         methods and results:
(Part 1) : 2003/ISO 5725-1 : 1994        General principles and definitions
(Part 2) : 2003/ISO 5725-2 : 1994        Basic method for the determination of repeatability
                                         and reproducibility of a standard measurement method
(Part 3) : 2003/ISO 5725-3 : 1994        Intermediate measures of the precision of a standard
                                         measurement method
(Part 4) : 2003/ISO 5725-4 : 1994        Basic methods for the determination of the trueness
                                         of a standard measurement method
(Part 5) : 2003/ISO 5725-5 : 1998        Alternative methods for the determination of the
                                         precision of a standard measurement method
(Part 6) : 2003/ISO 5725-6 : 1994        Use in practice of accuracy values


3 GENERAL STATISTICAL TERMS 


3.1 The terms are classified as: 


a) General statistical terms (see 3); and 
b) Terms used in probability (see 4). 


3.1.1 Population 


Totality of well-defined items under consideration. 


NOTES 


1 A population may be real and finite, real and infinite or 
completely hypothetical. Sometimes the term ‘finite 
population’ is used, especially in survey sampling. Likewise 
the term ‘infinite population’ is used in the context of sampling 
from a continuum. In 4, population will be viewed in a 
probabilistic context as the sample space (see 4.1). 


2 A hypothetical population allows one to imagine the nature
of further data under various assumptions. Hence, hypothetical
populations are useful at the design stage of statistical
investigations, particularly for determining appropriate sample
sizes. A hypothetical population could be finite or infinite in
number. It is a particularly useful concept in inferential statistics
to assist in evaluating the strength of evidence in a statistical
investigation.


3 The context of an investigation can dictate the nature of the 
population. For example, if three villages are selected for a 
demographic or health study, then the population consists of 
the residents of these particular villages. Alternatively, if the 
three villages were selected at random from among all of the 
villages in a specific region, then the population would consist 
of all residents of the region. 


3.1.2 Sampling Unit 


One of the individual parts into which a population 
(see 3.1.1) is divided. 


NOTE — Depending on the circumstances the smallest part of 
interest may be an individual, a household, a school district, 
an administrative unit and so forth. 


3.1.3 Sample 


Subset of a population (see 3.1.1) made up of one or
more sampling units (see 3.1.2) and representative
of the population.


NOTE — The sampling units could be items, numerical values 
or even abstract entities depending on the population of interest. 


3.1.4 Observed Value 


Obtained value of a property associated with a sampling
unit (see 3.1.2).


NOTES 


1 Common synonyms are ‘realization’ and ‘datum’. The plural 
of datum is data. 


2 The definition does not specify the genesis or how this value 
has been obtained. The value may represent one realization of 
a random variable (see 4.10), but not exclusively so. It may be 
one of several such values that will be subsequently subjected 
to statistical analysis. Although proper inferences require some 
statistical underpinnings, there is nothing to preclude 
computing summaries or graphical depictions of observed 
values. Only when attendant issues arise, such as determining the
probability of observing a specific set of realizations, does the
statistical machinery become both relevant and essential. The
preliminary stage of an analysis of observed values is 
commonly referred to as data analysis. 


3.1.5 Descriptive Statistics 


Graphical, numerical or other summary depiction of 
observed values (see 3.1.4). 
Examples:


1) Numerical summaries include average (see 
3.1.15), range (see 3.1.10), sample standard 
deviation (see 3.1.17), and so forth. 


2) Examples of graphical summaries include 
boxplots, diagrams, Q-Q plots, normal 
quantile plots, scatterplots, multiple 
scatterplots, histograms and Multi Vary 
Analysis. 


3.1.6 Random Sample 


Sample (see 3.1.3) which has been selected by a method 
of random selection. 


NOTES 


1 When the sample of n sampling units is selected from a finite 
sample space (see 4.1), each of the possible combinations of n 
sampling units will have a particular probability (see 4.5) of 
being taken. For survey sampling plans, the particular 
probability for each possible combination may be calculated 
in advance so that there will not be any bias towards the 
selection. 


2 For survey sampling from a finite sample space, a random 
sample can be selected by different sampling plans such as 
stratified random sampling, systematic random sampling, 
cluster sampling, sampling with probability of sampling 
proportional to the size of an auxiliary variable and many other 
possibilities. 

3 The definition generally refers to actual observed values 
(see 3.1.4). These observed values are considered as realizations 
of random variables (see 4.10), where each observed value 
corresponds to one random variable. When estimators 
(see 3.1.12), test statistics for statistical tests (see 3.1.48) or 
confidence intervals (see 3.1.28) are derived from a random 
sample, the definition accommodates reference to the random 
variables arising from abstract entities in the sample rather than 
the actual observed values of these random variables. 

4 Random samples from infinite populations are often 
generated by repeated draws from the sample space, leading 
to a sample consisting of independent, identically distributed 
random variables using the interpretation of this definition 
mentioned in Note 3. 


3.1.7 Simple Random Sample 


<finite population> random sample (see 3.1.6) such 
that each subset of a given size has the same probability 
of selection. 


3.1.8 Statistic 


Completely specified function of random variables 
(see 4.10). 


NOTES 


1 A statistic is a function of random variables in a random 
sample (see 3.1.6) in the sense given in Note 3 of 3.1.6. 


2 Referring to Note 1 above, if $\{X_1, X_2, \ldots, X_n\}$ is a random
sample from a normal distribution (see 4.50) with unknown
mean (see 4.35) $\mu$ and unknown standard deviation (see 4.37)
$\sigma$, then the expression $(X_1 + X_2 + \ldots + X_n)/n$ is a statistic, the
sample mean (see 3.1.15), whereas $[(X_1 + X_2 + \ldots + X_n)/n] - \mu$
is not a statistic as it involves the unknown value of the
parameter (see 4.9) $\mu$.


3.1.9 Order Statistic 


Statistic (see 3.1.8) determined by its ranking in a non- 
decreasing arrangement of random variables (see 4.10). 


Example — Let the observed values of a sample be 9, 
13, 7, 6, 13, 7, 19, 6, 10, and 7. The observed values 
of the order statistics are 6, 6, 7, 7, 7, 9, 10, 13, 13, 19. 
These values constitute realizations of $X_{(1)}$ through
$X_{(10)}$.


NOTES 
1 Let the observed values (see 3.1.4) of a random sample (see
3.1.6) be $\{x_1, x_2, \ldots, x_n\}$ and once sorted in non-decreasing
order designated as $x_{(1)} \leq \ldots \leq x_{(k)} \leq \ldots \leq x_{(n)}$. Then
$(x_{(1)}, \ldots, x_{(k)}, \ldots, x_{(n)})$ is the observed value of the order statistic
$(X_{(1)}, \ldots, X_{(k)}, \ldots, X_{(n)})$ and $x_{(k)}$ is the observed value of the kth
order statistic.

2 In practical terms, obtaining the order statistics for a data set 
amounts to sorting the data as formally described in Note 1. 
The sorted form of the data set then lends itself to obtaining 
useful summary statistics as given in the next few definitions. 


3 Order statistics involve sample values identified by their 
position after ranking in non-decreasing order. As in the 
example, it is easier to understand the sorting of sample values 
(realizations of random variables) rather than the sorting of 
unobserved random variables. Nevertheless, one can conceive 
of random variables from a random sample (see 3.1.6) being 
arranged in a nondecreasing order. For example, the maximum 
of n random variables can be studied in advance of its realized 
value. 


4 An individual order statistic is a statistic which is a completely 
specified function of a random variable. This function is simply 
the identity function with the further identification of position 
or rank in the sorted set of random variables. 


5 Tied values pose a potential problem especially for discrete 
random variables and for realizations that are reported to low 
resolution. The word ‘non-decreasing’ is used rather than 
‘ascending’ as a subtle approach to the problem. It should be
emphasized that tied values are retained and not collapsed into 
the single tied value. In the example above, the two realizations 
of 6 and 6 are tied values. 


6 Ordering takes place with reference to the real line and not 
to the absolute values of the random variables. 

7 The complete set of order statistics consists of an n-dimensional
random variable, where n is the number of observations in the
sample.

8 The components of the order statistic are also referred to as 
order statistics but with a qualifier that gives the number in the 
sequence of ordered values of the sample. 

9 The minimum, the maximum and for odd numbered sample
sizes, the sample median (see 3.1.13), are special cases of order
statistics. For example, for sample size 11, $X_{(1)}$ is the minimum,
$X_{(11)}$ is the maximum and $X_{(6)}$ is the sample median.
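
As an illustration only (not part of the standard), the sorting described in Notes 1 and 2 can be sketched in Python for the observed values of the example in this clause. The function names `order_statistics` and `kth_order_statistic` are hypothetical and chosen for this sketch.

```python
# Observed values from the example in 3.1.9.
observed = [9, 13, 7, 6, 13, 7, 19, 6, 10, 7]


def order_statistics(values):
    """Return the observed order statistics x(1) <= ... <= x(n)."""
    # sorted() arranges values in non-decreasing order and retains tied values.
    return sorted(values)


def kth_order_statistic(values, k):
    """Return x(k), where k is the 1-based rank used in this clause."""
    if not 1 <= k <= len(values):
        raise ValueError("k must lie between 1 and the sample size")
    return sorted(values)[k - 1]


ordered = order_statistics(observed)
print(ordered)                            # sorted sample, ties retained
print(kth_order_statistic(observed, 1))   # minimum
print(kth_order_statistic(observed, 10))  # maximum
```

The tied values (6, 6 and 7, 7, 7) are kept as separate entries, in line with Note 5.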


3.1.10 Sample Range 


Largest order statistic (see 3.1.9) minus the smallest 
order statistic, that is, the difference of the maximum 
and minimum values of the sample. 


Example — Continuing with the example from 3.1.9, 
the observed sample range is 19 - 6 = 13.
NOTE — In statistical process control, the sample range is 
often used to monitor the dispersion over time of a process, 
particularly when the sample sizes are relatively small. 


3.1.11 Mid-Range 


Average (see 3.1.15) of smallest and largest order 
statistics (see 3.1.9) or the average of the maximum 
and minimum values. 


Example — The observed mid-range for the example 
values given in 3.1.9 is (6+19)/2 = 12.5. 


NOTE — The mid-range provides a quick and simple 
assessment of the middle of small data sets. 
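
A minimal Python sketch of the sample range (see 3.1.10) and the mid-range, applied to the observed values of the example in 3.1.9, is given below for illustration. The function names are hypothetical.

```python
# Observed values from the example in 3.1.9.
observed = [9, 13, 7, 6, 13, 7, 19, 6, 10, 7]


def sample_range(values):
    """Largest observed order statistic minus the smallest."""
    return max(values) - min(values)


def mid_range(values):
    """Average of the smallest and largest observed order statistics."""
    return (min(values) + max(values)) / 2


print(sample_range(observed))  # 19 - 6
print(mid_range(observed))     # (6 + 19) / 2
```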


3.1.12 Estimator, $\hat{\theta}$


Statistic (see 3.1.8) used in estimation (see 3.1.36) of
the parameter $\theta$.


NOTES 


1 An estimator could be the sample mean (see 3.1.15) intended 
to estimate the population mean (see 4.35), which could be 
denoted by $\mu$. For a distribution (see 4.11) such as the normal
distribution (see 4.50), the “natural” estimator of the population
mean $\mu$ is the sample mean.


2 For estimating a population property [for example, the mode 
(see 4.27) for a univariate distribution (see 4.16)], an 
appropriate estimator could be a function of the estimator(s) 
of the parameter(s) of a distribution or could be a complex 
function of a random sample (see 3.1.6). 

3 The term “estimator” is used here in a broad sense. It includes 
the point estimator for a parameter, as well as the interval 
estimator which is possibly used for prediction (sometimes 
referred to as a predictor). Estimator also can include functions 
such as kernel estimators and other special purpose statistics. 
Additional discussion is provided in the notes to 3.1.36. 


3.1.13 Sample Median 


Sample median is that order statistic which divides the
sample into two equal parts. It is calculated as the [(n+1)/2]th
order statistic (see 3.1.9), if the sample size n is
odd; sum of the (n/2)th and [(n/2) + 1]th order statistics
divided by 2, if the sample size n is even.


Example — Continuing with the example of 3.1.9, the 
value of 8 is a realization of the sample median. In this 
case (even sample size of 10), the 5th and 6th values
were 7 and 9, whose average equals 8. In practice, this 
would be reported as “the sample median is 8”, 
although strictly speaking, the sample median is 
defined as a random variable. 


NOTES 


1 For a random sample (see 3.1.6) of sample size n whose
random variables (see 4.10) are arranged in non-decreasing
order from 1 to n, the sample median is the [(n+1)/2]th random
variable if the sample size is odd. If the sample size n is even,
then the sample median is the average of the (n/2)th and
[(n/2) + 1]th random variables.


2 Conceptually, it may seem impossible to conduct an ordering 
of random variables which have not yet been observed. 
Nevertheless, the structure for understanding order statistics 
can be established so that upon observation, the analysis may 
proceed. In practice, one obtains observed values and through 


IS 7920 (Part 1) : 2012 


sorting the values, one obtains realizations of the order statistics. 
These realizations can then be interpreted from the structure of 
order statistics from a random sample. 


3 The sample median provides an estimator of the middle of a 
distribution, with half of the sample to each side of it. 


4 In practice, the sample median is useful in providing an 
estimator that is insensitive to very extreme values in a data 
set. For example, median incomes and median housing prices 
are frequently reported as summary values. 
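
The odd and even cases of the definition can be sketched in Python as follows. This is an illustrative sketch, with the hypothetical function name `sample_median`, applied to the observed values of the example in 3.1.9.

```python
# Observed values from the example in 3.1.9.
observed = [9, 13, 7, 6, 13, 7, 19, 6, 10, 7]


def sample_median(values):
    """Observed sample median following the odd/even rule of 3.1.13."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the sample must contain at least one value")
    if n % 2 == 1:
        # Odd n: the [(n + 1)/2]th order statistic (1-based rank).
        return ordered[(n + 1) // 2 - 1]
    # Even n: average of the (n/2)th and [(n/2) + 1]th order statistics.
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


print(sample_median(observed))  # average of the 5th and 6th sorted values
```

Python's standard library function `statistics.median` applies the same rule and can be used instead of the hand-written function.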


3.1.14 Sample Moment of Order k, $E(X^k)$


Sum of the kth power of random variables (see 4.10) in a
random sample (see 3.1.6) divided by the number of
observations in the sample (see 3.1.3).


NOTES 


1 For a random sample of sample size n, that is $\{X_1, X_2, \ldots, X_n\}$,
the sample moment of order k, $E(X^k)$, is

$$\frac{1}{n}\sum_{i=1}^{n} X_i^{k}$$

2 Furthermore, this concept can be described as the sample
moment of order k about zero.

3 The sample moment of order 1 will be seen in the next
definition to be the sample mean (see 3.1.15).

4 Although the definition is given for arbitrary k, commonly
used instances in practice involve k = 1 [sample mean
(see 3.1.15)], k = 2 [associated with the sample variance
(see 3.1.16) and sample standard deviation (see 3.1.17)], k = 3
[related to sample coefficient of skewness (see 3.1.20)] and
k = 4 [related to sample coefficient of kurtosis (see 3.1.21)].

5 The “E” in $E(X^k)$ comes from the “expected value” or
“expectation” of the random variable X.
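
For illustration, the sample moment of order k about zero can be computed as in the following Python sketch. The function name `sample_moment` is hypothetical, and the data are the observed values of the example in 3.1.9.

```python
# Observed values from the example in 3.1.9.
observed = [9, 13, 7, 6, 13, 7, 19, 6, 10, 7]


def sample_moment(values, k):
    """Observed sample moment of order k about zero: (1/n) * sum of x_i**k."""
    if not values:
        raise ValueError("the sample must contain at least one value")
    return sum(x ** k for x in values) / len(values)


print(sample_moment(observed, 1))  # order 1: the observed sample mean
print(sample_moment(observed, 2))  # order 2: mean of the squared values
```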


3.1.15 Sample Mean, Average


Arithmetic mean: sum of random variables (see 4.10)
in a random sample (see 3.1.6) divided by the number
of terms in the sum.


Example — Continuing with the example from 3.1.9, 
the realization of the sample mean is 9.7 as the sum of 
the observed values is 97 and the sample size is 10. 


NOTES 


1 Considered as a statistic, the sample mean is a function of 
random variables from a random sample in the sense given in 
Note 3 of 3.1.6. One must distinguish this estimator from the
numerical value of the sample mean calculated from the 
observed values (see 3.1.4) in the random sample. 

2 The sample mean considered as a statistic is often used as an 
estimator for the population mean (see 4.35). A common 
synonym is arithmetic mean. 

3 For a random sample of sample size n, that is $\{X_1, X_2, \ldots, X_n\}$,
the sample mean is:

$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

4 The sample mean can be recognized as the sample moment
of order 1.


5 For sample size 2, the sample mean, the sample median 
(see 3.1.13) and mid-range (see 3.1.11) are the same. 
3.1.16 Sample Variance, $S^2$


Sum of squared deviations of random variables 
(see 4.10) in a random sample (see 3.1.6) from their 
sample mean (see 3.1.15) divided by the number of 
terms in the sum minus one. 


Example — Continuing with the numerical example of 
3.1.9, the sample variance can be computed to be 17.57. 
The sum of squares about the observed sample mean is 
158.10 and the sample size 10 minus 1 is 9, giving the 
appropriate denominator. 


NOTES 


1 Considered as a statistic (see 3.1.8), the sample variance $S^2$
is a function of random variables from a random sample. One
has to distinguish this estimator (see 3.1.12) from the numerical
value of the sample variance calculated from the observed
values (see 3.1.4) in the random sample. This numerical value
is called the empirical sample variance or the observed sample
variance and is usually denoted by $s^2$.


2 For a random sample of sample size n, that is $\{X_1, X_2, \ldots, X_n\}$,
with sample mean $\bar{X}$ the sample variance is:

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2$$


3 The sample variance is a statistic that is “almost” the average
of the squared deviations of the random variables (see 4.10)
from their sample mean (only “almost” since n - 1 is used rather
than n in the denominator). Using n - 1 provides an unbiased
estimator (see 3.1.34) of the population variance (see 4.36).

4 The quantity n - 1 is known as the degrees of freedom
(see 4.54).

5 The sample variance can be recognized to be the 2nd sample
moment of the standardized sample random variables
(see 3.1.19).
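
The computation in the example of this clause can be reproduced with the following Python sketch, given for illustration only. The function name `sample_variance` is hypothetical, and the data are the observed values of the example in 3.1.9.

```python
# Observed values from the example in 3.1.9.
observed = [9, 13, 7, 6, 13, 7, 19, 6, 10, 7]


def sample_variance(values):
    """Observed sample variance with the n - 1 denominator."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    mean = sum(values) / n
    # Sum of squared deviations about the observed sample mean.
    sum_of_squares = sum((x - mean) ** 2 for x in values)
    return sum_of_squares / (n - 1)


print(round(sample_variance(observed), 2))  # 158.10 / 9, rounded
```

Python's `statistics.variance` also uses the n - 1 denominator and gives the same result.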


3.1.17 Sample Standard Deviation, S 


Non-negative square root of the sample variance 
(see 3.1.16). 


Example — Continuing with the numerical example 
of 3.1.9, the observed sample standard deviation is 
4.192 since the observed sample variance is 17.57. 


NOTES 


1 In practice, the sample standard deviation is used to estimate 
the standard deviation (see 4.37). Here again, it should be
emphasized that S is also a random variable (see 4.10) and not 
a realization from a random sample (see 3.1.6). 


2 The sample standard deviation is a measure of the dispersion 
of a distribution (see 4.11). 


3.1.18 Sample Coefficient of Variation 


It is the standard deviation per unit of the mean value.
It is calculated as sample standard deviation 
(see 3.1.17) divided by the sample mean (see 3.1.15). 


NOTE — As with the coefficient of variation (see 4.38), the 
utility of this statistic is limited to populations that are positive 
valued. It is applicable where variation increases in proportion 
to mean. In situations where it is applicable, the mean, standard
deviation and coefficient of variation need to be monitored
simultaneously. The coefficient of variation is commonly 
reported as a percentage. 


3.1.19 Standardized Sample Random Variable 


Random variable (see 4.10) minus its sample mean 
(see 3.1.15) divided by the sample standard deviation 
(see 3.1.17). 


Example — For the example of 3.1.9, the observed 
sample mean is 9.7 and the observed sample standard 
deviation is 4.192. Hence, the observed standardized 
random variables (to two decimal places) are: 


-0.17; 0.79; -0.64; -0.88; 0.79; -0.64; 2.22; -0.88;
0.07; -0.64.


NOTES 


1 The standardized sample random variable is distinguished 
from its theoretical counterpart standardized random variable 
(see 4.33). The intent of standardizing is to transform random 
variables to have zero means and unit standard deviations, for 
ease in interpretation and comparison. 


2 Standardized observed values have an observed mean of zero 
and an observed standard deviation of 1. 
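
The standardization in the example of this clause can be sketched in Python as follows. This is an illustrative sketch; the function name `standardize` is hypothetical, and `statistics.stdev` is used because it applies the n - 1 denominator of 3.1.16.

```python
import statistics

# Observed values from the example in 3.1.9.
observed = [9, 13, 7, 6, 13, 7, 19, 6, 10, 7]


def standardize(values):
    """Subtract the observed sample mean, divide by the observed sample standard deviation."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # n - 1 denominator
    if stdev == 0:
        raise ValueError("all values are equal; standardization is undefined")
    return [(x - mean) / stdev for x in values]


z = standardize(observed)
print([round(value, 2) for value in z])
# Note 2: the standardized values have mean 0 and standard deviation 1.
print(statistics.mean(z), statistics.stdev(z))
```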


3.1.20 Sample Coefficient of Skewness 


It is a measure of lack of symmetry of the distribution.
It is calculated as arithmetic mean of the third power 
of the standardized sample random variables 
(see 3.1.19) from a random sample (see 3.1.6). 


Example — Continuing with the example from 3.1.9, 
the observed sample coefficient of skewness can be 
computed to be 0.971 88. For a sample size such as 10 
in this example, the sample coefficient of skewness is 
highly variable, so it must be used with caution. Using 
the alternative formula in Note 1, the computed value 
is 1.349 83. 


NOTES 


1 The formula corresponding to the definition is

$$\frac{1}{n}\sum_{i=1}^{n}\left(\frac{X_i - \bar{X}}{S}\right)^{3}$$

Some statistical packages use the following formula for the
sample coefficient of skewness to correct for bias (see 3.1.33):

$$\frac{n}{(n-1)(n-2)}\sum_{i=1}^{n} Z_i^{3}$$

where

$$Z_i = \frac{X_i - \bar{X}}{S}$$

For a large sample size, the distinction between the two
estimates is negligible. The ratio of the unbiased to the biased
estimate is 1.389 for n = 10, 1.031 for n = 100 and 1.003 for
n = 1 000.

2 Skewed data would also be reflected in values of the sample 
mean (see 3.1.15) and sample median (see 3.1.13) that are 
dissimilar. Positively skewed (right-skewed) data indicate the 
possible presence of a few extreme, large observations. 
Similarly, negatively skewed (left-skewed) data indicate the 
possible presence of a few extreme, small observations. 


3 The sample coefficient of skewness can be recognized to be
the 3rd sample moment of the standardized sample random
variables (see 3.1.19).
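
Both formulas of Note 1 can be checked against the example of this clause with the following Python sketch, given for illustration only. The function names are hypothetical, and the data are the observed values of the example in 3.1.9.

```python
import statistics

# Observed values from the example in 3.1.9.
observed = [9, 13, 7, 6, 13, 7, 19, 6, 10, 7]


def sum_of_cubed_z(values):
    """Sum of the third powers of the standardized observed values."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # n - 1 denominator, as in 3.1.17
    return sum(((x - mean) / stdev) ** 3 for x in values)


def sample_skewness(values):
    """Definition of 3.1.20: arithmetic mean of the cubed standardized values."""
    return sum_of_cubed_z(values) / len(values)


def adjusted_sample_skewness(values):
    """Alternative formula of Note 1 with the factor n / ((n - 1)(n - 2))."""
    n = len(values)
    if n < 3:
        raise ValueError("at least three observations are required")
    return n / ((n - 1) * (n - 2)) * sum_of_cubed_z(values)


print(round(sample_skewness(observed), 5))
print(round(adjusted_sample_skewness(observed), 5))
```

The ratio of the two results is n^2 / ((n - 1)(n - 2)), which is 100/72 for n = 10, in agreement with the ratio 1.389 quoted in Note 1.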


3.1.21 Sample Coefficient of Kurtosis 


It measures the flatness or peakedness of the curve of 
the distribution. It is calculated as arithmetic mean of 
the fourth power of the standardized sample random 
variables (see 3.1.19) from a random sample 
(see 3.1.6). 


Example — Continuing with the example from 3.1.9, 
the observed sample coefficient of kurtosis can be 
computed to be 2.674 19. For such a sample size as 10 
in this example, the sample coefficient of kurtosis is 
highly variable, so it must be used with caution. 
Statistical packages use various adjustments in 
computing the sample coefficient of kurtosis (see Note 3 
of 4.40). Using the alternate formula given in Note 1, 
the computed value is 0.436 05. The two values 2.674 19
and 0.436 05 are not directly comparable. To compare them,
subtract 3 from 2.674 19 (to relate it to the kurtosis of the normal
distribution, which is 3), giving -0.325 81, which
can then be appropriately compared to 0.436 05.


NOTES 


1 The formula corresponding to the definition is:

$$\frac{1}{n}\sum_{i=1}^{n}\left(\frac{X_i - \bar{X}}{S}\right)^{4}$$

Some statistical packages use the following formula for the
sample coefficient of kurtosis to correct for bias (see 3.1.33)
and to indicate the deviation from the kurtosis of the normal
distribution (which equals 3):

$$\frac{n(n+1)}{(n-1)(n-2)(n-3)}\sum_{i=1}^{n} Z_i^{4} - \frac{3(n-1)^{2}}{(n-2)(n-3)}$$

where

$$Z_i = \frac{X_i - \bar{X}}{S}$$

The second term in the expression is approximately 3 for large 
n. Sometimes the kurtosis is reported as a value as defined 
in 4.40 minus 3 to emphasize comparisons to the normal 
distribution. Obviously, a practitioner needs to be aware of the 
adjustments, if any, in statistical package computations. 


2 For the normal distribution (see 4.50), the sample coefficient 
of kurtosis is approximately 3, subject to sampling variability. 
In practice, the kurtosis of the normal distribution provides a 
benchmark or baseline value. Distributions (see 4.11) with 
values smaller than 3 have lighter tails than the normal 
distribution; distributions with values larger than 3 have heavier 
tails than the normal distribution. 


3 For observed values of kurtosis much larger than 3, the
possibility exists that the underlying distribution has genuinely
heavier tails than the normal distribution. Another possibility
to be investigated is the presence of potential outliers, that is,
observations in a sample so far separated in value from the
remainder as to suggest that they may be from a different
population or the result of an error in measurement.


4 The sample coefficient of kurtosis can be recognized to be
the 4th sample moment of the standardized sample random
variables.
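
The formula corresponding to the definition can be checked against the example of this clause with the following Python sketch, given for illustration only. It covers only the definition formula and the subtraction of 3 described in the example. The function name `sample_kurtosis` is hypothetical.

```python
import statistics

# Observed values from the example in 3.1.9.
observed = [9, 13, 7, 6, 13, 7, 19, 6, 10, 7]


def sample_kurtosis(values):
    """Definition of 3.1.21: arithmetic mean of the fourth powers of the standardized values."""
    mean = statistics.mean(values)
    stdev = statistics.stdev(values)  # n - 1 denominator, as in 3.1.17
    return sum(((x - mean) / stdev) ** 4 for x in values) / len(values)


kurtosis = sample_kurtosis(observed)
print(round(kurtosis, 5))      # value according to the definition
print(round(kurtosis - 3, 5))  # deviation from the normal-distribution value of 3
```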


3.1.22 Sample Covariance, $S_{XY}$


Sum of products of deviations of pairs of random
variables (see 4.10) in a random sample (see 3.1.6) 
from their sample means (see 3.1.15) divided by the 
number of terms in the sum minus one. 


Examples: 


1) Consider the following numerical illustration using 
10 observed 3-tuples (triplets) of values as given in 
Table 1. For this example, consider only x and y. 


Table 1 Results for Example 1
(Clause 3.1.22)


Sl No.   i    1    2    3    4     5    6    7    8    9     10

i)       x    38   41   24   60    41   51   58   50   65    33
ii)      y    73   74   43   107   65   73   99   72   100   48
iii)     z    34   31   40   28    35   28   32   27   27    31


The observed sample mean for X is 46.1 and for Y 
is 75.4. The sample covariance is equal to 


[(38 - 46.1) × (73 - 75.4) + (41 - 46.1) × (74 -
75.4) + ... + (33 - 46.1) × (48 - 75.4)]/9 = 257.178


2) In the table of the previous example, consider only 
y and z. The observed sample mean for Z is 31.3. 
The sample covariance is equal to 


[(73 - 75.4) × (34 - 31.3) + (74 - 75.4) × (31 -
31.3) + ... + (48 - 75.4) × (31 - 31.3)]/9 = -54.356


NOTES 


1 Considered as a statistic (see 3.1.8) the sample covariance is
a function of pairs of random variables $[(X_1, Y_1), (X_2, Y_2), \ldots,
(X_n, Y_n)]$ from a random sample of size n in the sense given in
Note 3 of 3.1.6. This estimator (see 3.1.12) needs to be
distinguished from the numerical value of the sample
covariance calculated from the observed pairs of values of the
sampling units (see 3.1.2) $[(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)]$ in the
random sample. This numerical value is called the empirical
sample covariance or the observed sample covariance.


2 The sample covariance $S_{XY}$ is given as:

$$S_{XY} = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)$$


3 Using n - 1 provides an unbiased estimator (see 3.1.34) of
the population covariance (see 4.43). 


4 The example in Table 1 consists of three variables whereas
the definition refers to a pair of variables. In practice, it is 
common to encounter situations with multiple variables. 


3.1.23 Sample Correlation Coefficient, $r_{xy}$


Correlation coefficient measures the degree of linear 
relationship between two variables. It is calculated as 
sample covariance (see 3.1.22) divided by the product 
of the corresponding sample standard deviations 
(see 3.1.17). 


Examples: 
1) Continuing with Example 1 of 3.1.22, the observed
standard deviation is 12.948 for X and 21.329 for Y.
Hence, the observed sample correlation coefficient 
(for X and Y) is given by: 


257.178/(12.948 × 21.329) = 0.931 2


2) Continuing with Example 2 of 3.1.22, the observed 
standard deviation is 21.329 for Y and 4.165 for Z. 
Hence, the observed sample correlation coefficient 
(for Y and Z) is given by: 


-54.356/(21.329 × 4.165) = -0.612


NOTES


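1 Notationally, the sample correlation coefficient is computed
as:

$$\frac{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)\left(Y_i - \bar{Y}\right)}{\sqrt{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^{2}\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^{2}}}$$

This expression is equivalent to the ratio of the sample
covariance to the product of the sample standard deviations.
Sometimes the symbol $r_{xy}$ is used to denote the sample
correlation coefficient. The observed sample correlation
coefficient is based on realizations
$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$.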

2 The observed sample correlation coefficient can take on values
in [-1, 1], with values near 1 indicating strong positive
correlation and values near -1 indicating strong negative
correlation. Values near 1 or -1 indicate that the points are
nearly on a straight line and values near 0 indicate no linear
relationship.
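
The two examples of this clause can be reproduced with the following Python sketch, given for illustration only. It uses the sums-of-deviations form of Note 1; the function name `sample_correlation` is hypothetical, and the data are taken from Table 1 of 3.1.22.

```python
import math

# Observed values from Table 1 of 3.1.22.
x = [38, 41, 24, 60, 41, 51, 58, 50, 65, 33]
y = [73, 74, 43, 107, 65, 73, 99, 72, 100, 48]
z = [34, 31, 40, 28, 35, 28, 32, 27, 27, 31]


def sample_correlation(first, second):
    """Observed sample correlation coefficient of paired values."""
    n = len(first)
    if n != len(second) or n < 2:
        raise ValueError("two variables with the same number (>= 2) of observations are required")
    mean_first = sum(first) / n
    mean_second = sum(second) / n
    cross = sum((a - mean_first) * (b - mean_second) for a, b in zip(first, second))
    ss_first = sum((a - mean_first) ** 2 for a in first)
    ss_second = sum((b - mean_second) ** 2 for b in second)
    if ss_first == 0 or ss_second == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cross / math.sqrt(ss_first * ss_second)


print(round(sample_correlation(x, y), 4))  # Example 1
print(round(sample_correlation(y, z), 3))  # Example 2
```

The n - 1 factors of the sample covariance and of the two sample standard deviations cancel, which is why only sums of deviations appear in the function.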


3.1.24 Standard Error, $\sigma_{\hat{\theta}}$


Standard deviation (see 4.37) of an estimator
(see 3.1.12) $\hat{\theta}$.


Example — If the sample mean (see 3.1.15) is the 
estimator of the population mean (see 4.35) and the 
standard deviation of a single random variable
(see 4.10) is $\sigma$, then the standard error of the sample
mean is $\sigma/\sqrt{n}$ where n is the number of observations
in the sample. An estimator of the standard error is $S/\sqrt{n}$
where S is the sample standard deviation (see 3.1.17).


NOTES 


1 In practice, the standard error provides a natural estimate of 
the standard deviation of an estimator. 


2 There is no (sensible) complementary term “non-standard” 
error. Standard error can be viewed as an abbreviation for the 
expression “standard deviation of an estimator”. Commonly, 
in practice, standard error is implicitly referring to the standard 
deviation of the sample mean. The notation for the standard 
error of the sample mean is $\sigma_{\bar{X}}$.
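
As an illustration of the estimator of the standard error of the sample mean given in the example of this clause, the following Python sketch evaluates the sample standard deviation divided by the square root of n. Applying it to the observed values of 3.1.9 is an assumption made for this sketch and is not an example from the standard; the function name is hypothetical.

```python
import math
import statistics

# Observed values from the example in 3.1.9, reused here as an assumption.
observed = [9, 13, 7, 6, 13, 7, 19, 6, 10, 7]


def estimated_standard_error_of_mean(values):
    """Observed value of S / sqrt(n), with S using the n - 1 denominator."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two observations are required")
    return statistics.stdev(values) / math.sqrt(n)


print(estimated_standard_error_of_mean(observed))
```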


3.1.25 Interval Estimator 


Interval, bounded by an upper limit statistic (see 3.1.8) 
and a lower limit statistic, with desired level of 
confidence, to estimate unknown population parameter. 


NOTES 

1 One of the end points could be $+\infty$, $-\infty$ or a natural limit of
the value of a parameter. For example, 0 is a natural lower limit 
for an interval estimator of the population variance (see 4.36). 
In such cases, the intervals are commonly referred to as one- 
sided intervals. 

2 An interval estimator can be given in conjunction with 
parameter (see 4.9) estimation (see 3.1.36). The interval 
estimator is presumed to contain a parameter on a stated 
proportion of occasions, under conditions of repeated sampling, 
or in some other probabilistic sense. 

3 Three common types of interval estimators include confidence 
intervals (see 3.1.28) for parameter(s), prediction intervals 
(see 3.1.30) for future observations, and statistical tolerance 
intervals (see 3.1.26) on the proportion of a distribution 
(see 4.11) contained. 


3.1.26 Statistical Tolerance Interval 


Interval determined from a random sample (see 3.1.6) 
in such a way that one may have a specified level of 
confidence that the interval covers at least a specified 
proportion of the sampled population (see 3.1.1). 


NOTE — The confidence in this context is the long-run 
proportion of intervals constructed in this manner that will 
include at least the specified proportion of the sampled 
population. 


3.1.27 Statistical Tolerance Limit 


Statistic (see 3.1.8) representing an end point of a 
statistical tolerance interval (see 3.1.26). 


NOTE — Statistical tolerance intervals may be either 


a) one-sided (with one of its limits fixed at the natural boundary 
of the random variable), in which case they have either an 
upper or a lower statistical tolerance limit, or 


b) two-sided, in which case they have both. 


A natural boundary of the random variable may provide a limit 
for a one-sided limit. 


3.1.28 Confidence Interval 


Interval estimator (see 3.1.25) (T_0, T_1) for the parameter
(see 4.9) θ with the statistics (see 3.1.8) T_0 and T_1
as interval limits and for which it holds that
P[T_0 < θ < T_1] ≥ 1 − α.


NOTES 


1 The confidence reflects the proportion of cases that the 
confidence interval would contain the true parameter value 
in a long series of repeated random samples (see 3.1.6) under 
identical conditions. A confidence interval does not reflect 
the probability (see 4.5) that the observed interval contains 
the true value of the parameter (it either does or does not 
contain it). 


2 Associated with this confidence interval is the attendant 
performance characteristic 100 (1 − α) percent, where α is
generally a small number. The performance characteristic,
which is called the confidence coefficient or confidence level,
is often 95 percent or 99 percent. The inequality
P[T_0 < θ < T_1] ≥ 1 − α holds for any specific but unknown
population value of θ.
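
As a rough illustration of the long-run interpretation in Note 1, the following Python sketch builds a two-sided interval for the mean of a normal distribution and estimates, by simulation, the proportion of such intervals that contain the true mean. It is not part of the standard: the function names, the assumption that the standard deviation is known, the seed and the parameter values are all illustrative choices.

```python
import random
from statistics import NormalDist, fmean


def z_confidence_interval(sample, sigma, alpha=0.05):
    """Two-sided interval (T_0, T_1) for the mean, ASSUMING sigma is known."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / len(sample) ** 0.5
    centre = fmean(sample)
    return centre - half_width, centre + half_width


def estimated_coverage(mu, sigma, n, alpha, repetitions, seed=1):
    """Proportion of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(repetitions):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        t0, t1 = z_confidence_interval(sample, sigma, alpha)
        hits += t0 < mu < t1
    return hits / repetitions


# Hypothetical settings: the long-run proportion should be near 1 - alpha.
coverage = estimated_coverage(mu=12.5, sigma=2.0, n=10, alpha=0.05,
                              repetitions=2000)
print(f"estimated coverage: {coverage:.3f}")
```

Each individual interval either contains the true mean or does not; only the proportion over many repetitions is expected to be close to 1 − α.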


3.1.29 One-Sided Confidence Interval 


Confidence interval (see 3.1.28) with one of its end 
points fixed at +∞, −∞, or a natural fixed boundary.


NOTES 


1 Definition 3.1.28 applies with either T_0 set at −∞ or T_1 set at
+∞. One-sided confidence intervals arise in situations where
interest focuses strictly on one direction. For example, in audio 
volume testing for safety concerns in cellular telephones, an 
upper confidence limit would be of interest indicating an upper 
bound for the volume produced under presumed safe 
conditions. For structural mechanical testing, a lower 
confidence limit on the force at which a device fails would be 
of interest. 


2 Another instance of one-sided confidence intervals occurs in 
situations where a parameter has a natural boundary such as 
zero. For a Poisson distribution (see 4.47) involved in modelling 
customer complaints, zero is a lower bound. As another 
example, a confidence interval for the reliability of an electronic 
component could be (0.98, 1), where 1 is the natural upper 
boundary limit. 


3.1.30 Prediction Interval 


Range of values of a variable, derived from a random 
sample (see 3.1.6) of values from a continuous 
population, within which it can be asserted with a given 
confidence that no fewer than a given number of values 
in a further random sample from the same population 
(see 3.1.1) will fall. 


NOTE — Commonly, interest focuses on a single further 
observation arising from the same situation as the observations
which are the basis of the prediction interval. Another practical 
context is regression analysis in which a prediction interval is 
constructed for a spectrum of independent values. 


3.1.31 Estimate 


Observed value (see 3.1.4) of an estimator (see 3.1.12). 


NOTE — Estimate refers to a numerical value obtained from 
observed values. With respect to estimation (see 3.1.36) of a 
parameter (see 4.9) from a hypothesized probability distribution 
(see 4.11), estimator refers to the statistic (see 3.1.8) intended 
to estimate the parameter and estimate refers to the result using 
observed values. Sometimes the adjective “point” is inserted 
before estimate to emphasize that a single value is being 
produced rather than an interval of values. Similarly, the 
adjective "interval" is inserted before estimate in cases where 
interval estimation is taking place. 


3.1.32 Error of Estimation 


It is the difference between the estimate and the
parametric value, that is, estimate (see 3.1.31) minus
the parameter (see 4.9) or population property that it
is intended to estimate.


NOTES 


1 Population property may be a function of the parameter or 
parameters or another quantity related to the probability 
distribution (see 4.11). 

2 Estimator error could involve contributions due to sampling, 
measurement uncertainty, rounding, or other sources. In effect, 
estimator error represents the bottom line performance of 
interest to practitioners. Determining the primary contributors 
to estimator error is a critical element in quality improvement
efforts. 


3.1.33 Bias 


Expectation (see 4.12) of error of estimation 
(see 3.1.32). 


NOTES 


1 Here bias is used in a generic sense as indicated in Note 1 
in 3.1.34. 


2 The existence of bias can lead to unfortunate consequences 
in practice. For example, underestimation of the strength of 
materials due to bias could lead to unexpected failures of a 
device. In survey sampling, bias could lead to incorrect 
decisions from a political poll. 


3.1.34 Unbiased Estimator 


It is the estimator which is, on average, equal to the
population parameter value, that is, an estimator
(see 3.1.12) having bias (see 3.1.33) equal to zero.


Examples: 


1) For a random sample (see 3.1.6) of n
independent random variables (see 4.10), each
with the same normal distribution (see 4.50)
with mean μ (see 4.35) and standard deviation
σ (see 4.37), the sample mean X̄ (see 3.1.15)
and the sample variance S² (see 3.1.16) are
unbiased estimators for the mean μ and the
variance σ² (see 4.36), respectively.


2) As is mentioned in Note 1 to 3.1.37 the 
maximum likelihood estimator (see 3.1.35) 
of the variance σ² uses the denominator n
instead of n − 1 and thus is a biased estimator.
In applications, the sample standard deviation
(see 3.1.17) receives considerable use but it
is important to note that the square root of
the sample variance using n − 1 is a biased
estimator of the population standard deviation 
(see 4.37). 


3) For a random sample of n independent pairs
of random variables, each pair with the same
bivariate normal distribution (see 4.65) with
covariance (see 4.43) equal to ρσ_XY, the
sample covariance (see 3.1.22) is an unbiased
estimator for population covariance. The
maximum likelihood estimator uses n instead
of n − 1 in the denominator and thus is biased.


NOTE — Estimators that are unbiased are desirable in 
that on average, they give the correct value. Certainly, 
unbiased estimators provide a useful starting point in 
the search for “optimal” estimators of population 
parameters. The definition given here is of a statistical 
nature. 


In everyday usage, practitioners try to avoid introducing 
bias into a study by ensuring, for example, that the 
random sample is representative of the population of 
interest. 
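
The bias statements in Examples 1 and 2 can be checked exactly on a small case. The Python sketch below is an illustration only: instead of a normal distribution it uses a hypothetical four-valued population, so that the expectation of each estimator can be computed by enumerating every possible sample of size 3 drawn with replacement. The names and values are assumptions made for the illustration.

```python
from fractions import Fraction
from itertools import product
from math import sqrt


def variance(values, ddof):
    """Sum of squared deviations divided by (n - ddof), in exact arithmetic."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((v - mean) ** 2 for v in values) / (n - ddof)


population = [0, 1, 2, 3]   # hypothetical population, each value equally likely
n = 3                       # sample size
samples = list(product(population, repeat=n))  # all equally likely samples

pop_var = variance(population, ddof=0)

# Expectation of each estimator = average over all equally likely samples.
mean_of_s2 = sum(variance(s, 1) for s in samples) / len(samples)
mean_of_mle = sum(variance(s, 0) for s in samples) / len(samples)
mean_of_s = sum(sqrt(variance(s, 1)) for s in samples) / len(samples)

print("population variance:", pop_var)
print("E[S^2], denominator n - 1:", mean_of_s2)
print("E[variance], denominator n:", mean_of_mle)
print("E[S] versus sigma:", mean_of_s, sqrt(pop_var))
```

In this enumeration the denominator n − 1 reproduces the population variance exactly, the denominator n falls short by the factor (n − 1)/n, and the square root of the sample variance underestimates the population standard deviation on average.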
3.1.35 Maximum Likelihood Estimator


Estimator (see 3.1.12) assigning the value of the 
parameter (see 4.9) where the likelihood function 
(see 3.1.38) attains or approaches its highest value. 


NOTES 


1 Maximum likelihood estimation is a well-established 
approach for obtaining parameter estimates where a distribution 
(see 4.11) has been specified [for example, normal (see 4.50), 
gamma (see 4.56), Weibull (see 4.63), and so forth]. These 
estimators have desirable statistical properties (for example, 
invariance under monotone transformation) and in many 
situations provide the estimation method of choice. In cases in 
which the maximum likelihood estimator is biased, a simple 
bias (see 3.1.33) correction sometimes takes place. As 
mentioned in Example 2 of 3.1.34 the maximum likelihood 
estimator for the variance (see 4.36) of the normal distribution 
is biased but it can be corrected by using n − 1 rather than n.
The extent of the bias in such cases decreases with increasing 
sample size. 


2 The abbreviation MLE is commonly used both for maximum 
likelihood estimator and maximum likelihood estimation with 
the context indicating the appropriate choice. 


3.1.36 Estimation 


Procedure that obtains a statistical representation of a 
population (see 3.1.1) from a random sample (see 3.1.6) 
drawn from this population. 


NOTES 


1 In particular, the procedure involved in progressing from an 
estimator (see 3.1.12) to a specific estimate (see 3.1.31) 
constitutes estimation. 


2 Estimation is understood in a rather broad context to include 
point estimation, interval estimation or estimation of properties 
of populations. 


3 Frequently, a statistical representation refers to the estimation 
of a parameter (see 4.9) or parameters or a function of 
parameters from an assumed model. More generally, the 
representation of the population could be less specific, such as 
statistics related to impacts from natural disasters (casualties,
injuries, property losses and agricultural losses, all of which
an emergency manager might wish to estimate). 


4 Consideration of descriptive statistics (see 3.1.5) could 
suggest that an assumed model provides an inadequate 
representation of the data, such as indicated by a measure of 
the goodness of fit of the model to the data. In such cases, 
other models could be considered and the estimation process 
continued. 


3.1.37 Maximum Likelihood Estimation 


Estimation (see 3.1.36) based upon the maximum 
likelihood estimator (see 3.1.35). 


NOTES 


1 For the normal distribution (see 4.50), the sample mean 
(see 3.1.15) is the maximum likelihood estimator (see 3.1.35) 
of the parameter (see 4.9) μ, while the sample variance
(see 3.1.16), using the denominator n rather than n − 1, provides
the maximum likelihood estimator of σ². The denominator n − 1
is typically used since this value provides an unbiased estimator 
(see 3.1.34). 

2 Maximum likelihood estimation is sometimes used to 
describe the derivation of an estimator (see 3.1.12) from the 
likelihood function. 


3 Although in some cases, a closed-form expression emerges
using maximum likelihood estimation, there are other situations 
in which the maximum likelihood estimator requires an iterative 
solution to a set of equations. 


4 The abbreviation MLE is commonly used both for maximum 
likelihood estimator and maximum likelihood estimation with 
the context indicating the appropriate choice. 
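
For the normal case in Note 1, the closed-form estimators can be sketched with the Python standard library, where `statistics.pvariance` divides by n and `statistics.variance` divides by n − 1. The data values below are hypothetical and serve only to show the relation between the two denominators.

```python
from statistics import fmean, pvariance, variance

# Hypothetical observations, assumed to come from a normal distribution.
observations = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(observations)

mu_hat = fmean(observations)              # maximum likelihood estimate of mu
sigma2_mle = pvariance(observations)      # denominator n (maximum likelihood)
sigma2_unbiased = variance(observations)  # denominator n - 1 (unbiased)

print(mu_hat, sigma2_mle, sigma2_unbiased)
# The two variance estimates differ only by the factor (n - 1) / n.
print(sigma2_unbiased * (n - 1) / n)
```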


3.1.38 Likelihood Function 


Probability density function (see 4.26) evaluated at the 
observed values (see 3.1.4) and considered as a function 
of the parameters (see 4.9) of the family of distributions 
(see 4.8). 


Examples: 


1) Consider a situation in which ten items are 
selected at random from a very large 
population (see 3.1.1) and 3 of the items are 
found to have a specific characteristic. From 
this sample, an intuitive estimate (see 3.1.31) 
of the population proportion having the 
characteristic is 0.3 (3 out of 10). Under a 
binomial distribution (see 4.46) model, the 
likelihood function (probability mass function 
as a function of p with n fixed at 10 and x at 
3) achieves its maximum at p = 0.3, thus 
agreeing with intuition. [This can be further 
verified by plotting the probability mass 
function of the binomial distribution 
(see 4.46), 120 p³(1 − p)⁷, versus p.]


2) For the normal distribution (see 4.50) with 
known standard deviation (see 4.37), it can 
be shown in general that the likelihood 
function takes its maximum at μ equal to the
sample mean. 
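
The plot suggested in Example 1 can be replaced by a simple numerical check. The following Python sketch evaluates the binomial likelihood with n = 10 and x = 3 on a grid of candidate values of p and reports where it is largest. The grid spacing of 0.001 is an arbitrary choice made for the illustration.

```python
from math import comb


def binomial_likelihood(p, n=10, x=3):
    """Binomial probability mass function, viewed as a function of p."""
    return comb(n, x) * p ** x * (1 - p) ** (n - x)


# Candidate values of p from 0 to 1 in steps of 0.001 (illustrative grid).
grid = [i / 1000 for i in range(1001)]
p_hat = max(grid, key=binomial_likelihood)

print("binomial coefficient:", comb(10, 3))
print("grid value maximizing the likelihood:", p_hat)
```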


3.1.39 Profile Likelihood Function 


Likelihood function (see 3.1.38) as a function of a 
single parameter (see 4.9) with all other parameters 
set to maximize it. 


3.1.40 Hypothesis, H 


Statement about a population (see 3.1.1). 
NOTE — Commonly the statement about the population 


concerns one or more parameters (see 4.9) in a family of 
distributions (see 4.8) or about the family of distributions. 


3.1.41 Null Hypothesis, H_0


Hypothesis (see 3.1.40) to be tested by means of a 
statistical test (see 3.1.48). 


Examples: 


1) In a random sample (see 3.1.6) of independent
random variables (see 4.10) with the same
normal distribution (see 4.50) with unknown
mean (see 4.35) and unknown standard
deviation (see 4.37), a null hypothesis for the
mean μ may be that the mean is less than or
equal to a given value μ_0, and this is usually
written in the following way: H_0: μ ≤ μ_0.


2) A null hypothesis may be that the statistical
model for a population (see 3.1.1) is a normal
distribution. For this type of null hypothesis,
the mean and standard deviation are not
specified.


3) A null hypothesis may be that the statistical
model for a population consists of a
symmetric distribution. For this type of null
hypothesis, the form of the distribution is not
specified.


NOTES


1 Explicitly, the null hypothesis can consist of a subset
from a set of possible probability distributions.

2 This definition should not be considered in isolation
from alternative hypothesis (see 3.1.42) and statistical
test (see 3.1.48), as proper application of hypothesis
testing requires all of these components.


3 In practice, one never proves the null hypothesis, but
rather the assessment in a given situation may be
inadequate to reject the null hypothesis. The original
motivation for conducting the hypothesis test would
likely have been an expectation that the outcome would
favour a specific alternative hypothesis relevant to the
problem at hand.


4 Failure to reject the null hypothesis is not “proof” of
its validity but may rather be an indication that there is
insufficient evidence to dispute it. Either the null
hypothesis (or a close proximity to it) is in fact true, or
the sample size is insufficient to detect a difference from
it.

5 In some situations, initial interest is focused on the
null hypothesis, but the possibility of a departure may
be of interest. Proper consideration of sample size and
power in detecting a specific departure or alternative
can lead to the construction of a test procedure for
appropriately assessing the null hypothesis.


6 The acceptance of the alternative hypothesis in
contrast to failing to reject the null hypothesis is a
positive result in that it supports the conjecture of
interest. Rejection of the null hypothesis in favour of
the alternative is an outcome with less ambiguity than
an outcome such as “failure to reject the null hypothesis
at this time.”


7 The null hypothesis is the basis for constructing the
corresponding test statistic (see 3.1.52) used to assess
the null hypothesis.


8 The null hypothesis is often denoted H_0 (H having a
subscript of zero although the zero is sometimes
pronounced ‘oh’ or ‘nought’).


9 The subset identifying the null hypothesis should, if
possible, be selected in such a way that the statement is
incompatible with the conjecture to be studied. See Note
2 to 3.1.48 and the example given in 3.1.49.


3.1.42 Alternative Hypothesis, H_A, H_1


The alternative hypothesis is the complement of the
null hypothesis.


Examples:


1) The alternative hypothesis to the null
hypothesis given in Example 1 of 3.1.41 is
that the mean (see 4.35) is larger than the
specified value, which is written in the
following way: H_1: μ > μ_0.


2) The alternative hypothesis to the null
hypothesis given in Example 2 of 3.1.41 is
that the statistical model of the population is
not a normal distribution (see 4.50).


3) The alternative hypothesis to the null
hypothesis given in Example 3 of 3.1.41 is
that the statistical model of the population
consists of an asymmetric distribution. For
this alternative hypothesis, the specific form
of asymmetry is not specified.


NOTES


1 Statement which selects a set or a subset of all possible
admissible probability distributions (see 4.11) which
do not belong to the null hypothesis (see 3.1.41).


2 The alternative hypothesis can also be denoted H_1 or
H_A with no clear preference as long as the symbolism
parallels the null hypothesis notation.


3 The alternative hypothesis is a statement which
contradicts the null hypothesis. The corresponding test
statistic (see 3.1.52) is used to decide between the null
and alternative hypotheses.


4 The alternative hypothesis should not be considered
in isolation from the null hypothesis or the statistical
test (see 3.1.48).

5 The acceptance of the alternative hypothesis in contrast
to failing to reject the null hypothesis is a positive result
in that it supports the conjecture of interest.


3.1.43 Simple Hypothesis


Hypothesis (see 3.1.40) that specifies a single
distribution in a family of distributions (see 4.8).


NOTES

1 A simple hypothesis is a null hypothesis (see 3.1.41) or
alternative hypothesis (see 3.1.42) for which the selected subset
consists of only a single probability distribution (see 4.11).


2 In a random sample (see 3.1.6) of independent random
variables (see 4.10) with the same normal distribution (see 4.50)
with unknown mean (see 4.35) and known standard deviation
(see 4.37) σ, a simple hypothesis for the mean μ is that the
mean is equal to a given value μ_0, and this is usually written in
the following way: H_0: μ = μ_0.


3 A simple hypothesis specifies the probability distribution (see
4.11) completely.


3.1.44 Composite Hypothesis


Hypothesis (see 3.1.40) that specifies more than one
distribution (see 4.11) in a family of distributions
(see 4.8).


Examples:


1) The null hypotheses (see 3.1.41) and the
alternative hypotheses (see 3.1.42) given in
the examples in 3.1.41 and 3.1.42 are all
examples of composite hypotheses.


2) In 3.1.48, the null hypothesis in Case 3 of 
Example 3 is a simple hypothesis. The null 
hypothesis in Example 4 is also a simple 
hypothesis. The other hypotheses in 3.1.48 are 
composite. 


NOTE — A composite hypothesis is a null hypothesis 
or alternative hypothesis for which the selected subset 
consists of more than a single probability distribution. 


3.1.45 Significance Level, α


<Statistical test> maximum probability (see 4.5) of 
rejecting the null hypothesis (see 3.1.41) when in fact 
it is true. 


NOTE — If the null hypothesis is a simple hypothesis 
(see 3.1.43), then the probability of rejecting the null hypothesis 
if it were true becomes a single value. 


3.1.46 Type I Error 


Rejection of the null hypothesis (see 3.1.41) when in 
fact it is true. 


NOTES 


1 In fact, a Type I error is an incorrect decision. Hence, it is 
desired to keep the probability (see 4.5) of making such an 
incorrect decision as small as possible. To obtain a zero 
probability of a Type I error, one would never reject the null 
hypothesis. In other words, regardless of the evidence, the same 
decision is made. 


2 It is possible in some situations (for example, testing the
binomial parameter p) that a prespecified significance level
such as 0.05 is not attainable due to discreteness of outcomes.
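
The discreteness effect in Note 2 can be seen in a small exact calculation. The Python sketch below assumes a hypothetical test of H_0: p ≤ 0.5 against H_1: p > 0.5 based on n = 10 trials, in which the null hypothesis is rejected when the number of successes X is at least some critical value c. It lists the levels that such a rule can actually attain at the boundary p = 0.5. The test set-up and the names are assumptions for illustration only.

```python
from fractions import Fraction
from math import comb


def attained_level(n, p0, critical_value):
    """P(X >= critical_value) for X ~ Binomial(n, p0), computed exactly."""
    p0 = Fraction(p0)
    return sum(comb(n, k) * p0 ** k * (1 - p0) ** (n - k)
               for k in range(critical_value, n + 1))


# Attainable levels for critical values c = 7, 8, 9, 10 at p = 1/2.
levels = {c: attained_level(10, Fraction(1, 2), c) for c in range(7, 11)}

for c, level in levels.items():
    print(f"reject when X >= {c}: level = {level} (about {float(level):.4f})")
```

No critical value gives a level of exactly 0.05 here; the rule either exceeds it or falls below it, so in practice the largest attainable level not exceeding the prespecified one is commonly used.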


3.1.47 Type II Error 


Failure to reject the null hypothesis (see 3.1.41) when 
in fact the null hypothesis is not true. 


NOTE — In fact, a Type II error is an incorrect decision. Hence, 
it is desired to keep the probability (see 4.5) of making such 
an incorrect decision as small as possible. Type II errors 
commonly occur in situations where the sample sizes are 
insufficient to reveal a departure from the null hypothesis. 


3.1.48 Statistical Test/Significance Test 


It is the rule, based on experimental values, that decides
whether or not to reject the null hypothesis (see 3.1.41)
under consideration in favour of an alternative
hypothesis (see 3.1.42).


Examples: 


1) As an example, if an actual, continuous
random variable (see 4.29) can take values
between −∞ and +∞ and one has a suspicion
that the true probability distribution is not a
normal distribution (see 4.50), then the
hypotheses will be formulated, as follows:


a) The scope of the situation is all
continuous probability distributions
(see 4.23), which can take values between
−∞ and +∞.

b) The conjecture is that the true probability
distribution is not a normal distribution.


c) The null hypothesis is that the probability
distribution is a normal distribution.


d) The alternative hypothesis is that the
probability distribution is not a normal
distribution.


2) If the random variable follows a normal
distribution with known standard deviation
(see 4.37) and one suspects that its expectation
value μ deviates from a given value μ_0, then
the hypotheses will be formulated according
to Case 3 in the next example.


3) This example considers three possibilities in
statistical testing.


Case 1: It is conjectured that the process
mean is higher than the target mean of μ_0. This
conjecture leads to the following hypotheses:


Null hypothesis: H_0: μ ≤ μ_0
Alternative hypothesis: H_1: μ > μ_0


Case 2: It is conjectured that the process
mean is lower than the target mean of μ_0. This
conjecture leads to the following hypotheses:


Null hypothesis: H_0: μ ≥ μ_0
Alternative hypothesis: H_1: μ < μ_0


Case 3: It is conjectured that the process
mean is not compatible with the target mean μ_0
but the direction is not specified. This
conjecture leads to the following hypotheses:


Null hypothesis: H_0: μ = μ_0
Alternative hypothesis: H_1: μ ≠ μ_0


In all three cases, the formulation of the
hypotheses was driven by a conjecture
regarding the alternative hypothesis and its
departure from a baseline condition.


4) This example considers as its scope all
proportions p_1 and p_2 between zero and one
of defectives in two lots 1 and 2. One might
suspect that the two lots are different and
therefore conjecture that the proportions of
defects in the two lots are different. This
conjecture leads to the following hypotheses:


Null hypothesis: H_0: p_1 = p_2
Alternative hypothesis: H_1: p_1 ≠ p_2


NOTES


1 A statistical test is a procedure, which is valid under
specified conditions, to decide, by means of
observations from a sample, whether the true probability
distribution belongs to the null hypothesis or the
alternative hypothesis.


2 Before a statistical test is carried out the possible set 
of probability distributions is at first determined on the 
basis of the available information. Next the probability 
distributions, which could be true on the basis of the 
conjecture to be studied, are identified to constitute the 
alternative hypothesis. Finally, the null hypothesis is 
formulated as the complement to the alternative 
hypothesis. In many cases, the possible set of probability 
distributions and hence also the null hypothesis and the 
alternative hypothesis can be determined by reference 
to sets of values of relevant parameters. 


3 As the decision is made on the basis of observations 
from a sample, it may be erroneous leading to either a 
Type I error (see 3.1.46), rejecting the null hypothesis 
when in fact it is correct, or a Type II error (see 3.1.47), 
failure to reject the null hypothesis in favour of the 
alternative hypothesis when the alternative hypothesis 
is true. 

4 Case 1 and Case 2 of Example 3 above are instances of 
one-sided tests. Case 3 is an example of a two-sided test. 
In all three of these cases, the one-sided versus two-sided 
qualifier is determined by consideration of the region of 
the parameter u corresponding to the alternative 
hypothesis. More generally, one-sided and two-sided tests 
can be governed by the region for rejection of the null 
hypothesis corresponding to the chosen test statistic. That 
is, the test statistic has an associated critical region 
favouring the alternative hypothesis, but it may not relate 
directly to a simple description of the parameter space as 
in Cases 1, 2 and 3. 

5 Careful attention to the underlying assumptions must 
be made or the application of statistical testing may be 
flawed. Statistical tests that lead to stable inferences 
even under possible mis-specification of the underlying 
assumptions are referred to as robust. The one-sample t 
test for the mean is an example of a test considered 
very robust under non-normal distributions. Bartlett's 
test for homogeneity of variances is an example of a 
non-robust procedure, possibly leading to the excessive 
rejection of equality of variances in distributional cases 
for which the variances were in fact identical. 
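
The three cases of Example 3 can be sketched as a decision rule in Python. The sketch assumes a normal distribution with known standard deviation, so that the standardized sample mean can be compared with standard normal quantiles. The function name, its arguments and the numerical values are hypothetical and are not prescribed by this standard.

```python
from statistics import NormalDist


def z_test_rejects(sample_mean, mu_0, sigma, n, case, alpha=0.05):
    """Return True if H_0 is rejected; ASSUMES normality and known sigma."""
    z = (sample_mean - mu_0) / (sigma / n ** 0.5)  # test statistic
    std_normal = NormalDist()
    if case == 1:   # H_0: mu <= mu_0 against H_1: mu > mu_0 (one-sided)
        return z > std_normal.inv_cdf(1 - alpha)
    if case == 2:   # H_0: mu >= mu_0 against H_1: mu < mu_0 (one-sided)
        return z < std_normal.inv_cdf(alpha)
    if case == 3:   # H_0: mu = mu_0 against H_1: mu != mu_0 (two-sided)
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    raise ValueError("case must be 1, 2 or 3")


# Hypothetical data summary: the test statistic is (10.45 - 10) / (1 / 4) = 1.8.
for case in (1, 2, 3):
    print(case, z_test_rejects(10.45, mu_0=10.0, sigma=1.0, n=16, case=case))
```

With the same data, the one-sided test of Case 1 and the two-sided test of Case 3 can reach different decisions, because the two-sided critical region splits the significance level between both tails.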


3.1.49 p-value 


It is also known as ‘probability value’. It is the
probability (see 4.5) of observing the observed test
statistic (see 3.1.52) value or any other value at least
as unfavourable to the null hypothesis (see 3.1.41).
Equivalently, it is the smallest significance level at
which the null hypothesis would be rejected.


Example — Consider the numerical example originally 
introduced in 3.1.9. Suppose for illustration that these 
values are observations from a process that is nominally 
expected to have a mean of 12.5, and from previous 
experience the engineer associated with the process 
felt that the process was consistently lower than the 
nominal value. A study was undertaken and a random 
sample of size 10 was collected with the numerical 
results from 3.1.9. The appropriate hypotheses are: 


Null hypothesis: H_0: μ ≥ 12.5
Alternative hypothesis: H_1: μ < 12.5


The sample mean is 9.7 which is in the direction of the
conjecture, but is it sufficiently far from 12.5 to support
the conjecture? For this example the test statistic
(see 3.1.52) is −1.976 4 with corresponding p-value
0.040. This means that there are less than four chances
in one hundred of observing a test statistic value of
−1.976 4 or lower, if in fact the true process mean is at
12.5. If the original prespecified significance level had
been 0.05, then typically one would reject the null
hypothesis in favour of the alternative hypothesis.


Suppose alternatively that the problem were formulated 
somewhat differently. Imagine that the concern was 
that the process was off the 12.5 target but the direction 
was unspecified. This leads to the following hypotheses:


Null hypothesis: H_0: μ = 12.5
Alternative hypothesis: H_1: μ ≠ 12.5


Given the same data collected from a random sample,
the test statistic is the same, −1.976 4. For this
alternative hypothesis, a question of interest is “what
is the probability of seeing such an extreme value or
more extreme?”. In this case, there are two relevant
regions, values less than or equal to −1.976 4 or values
greater than or equal to 1.976 4. The probability of a
test statistic occurring in one of these regions is 0.080 
(twice the one-sided value). There are eight chances 
in one hundred of observing a test statistic value this 
extreme or more so. Thus, the null hypothesis is not 
rejected at the significance level 0.05. 


NOTES 


1 If the p-value, for example, turns out to be 0.029, then there 
are less than three chances in one hundred that such an extreme 
value of the test statistic or a more extreme one, would occur 
under the null hypothesis. On the basis of this information, 
one might feel compelled to reject the null hypothesis, as this 
is a fairly small p-value. More formally, if the significance 
level had been established as 0.05, then definitely the p-value 
of 0.029 being less than 0.05 would lead to the rejection of the 
null hypothesis. 

2 The term p-value is sometimes referred to as the significance 
probability which should not be confused with significance 
level (see 3.1.45) which is a specified constant in an application. 
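
The p-values quoted in the example can be approximated numerically. The Python sketch below assumes that the test statistic −1.976 4 is a one-sample t statistic with 9 degrees of freedom (sample size 10 minus one); the example does not state the reference distribution explicitly, so this is an assumption. The tail probability is obtained by integrating the Student t density with Simpson's rule, using only the standard library; the function names are illustrative.

```python
from math import gamma, pi, sqrt


def t_density(t, df):
    """Probability density function of Student's t distribution."""
    const = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return const * (1 + t * t / df) ** (-(df + 1) / 2)


def t_lower_tail(t, df, steps=2000):
    """P(T <= t), using Simpson's rule on [t, 0] and symmetry about zero."""
    if t > 0:
        return 1 - t_lower_tail(-t, df, steps)
    h = -t / steps  # steps must be even for Simpson's rule
    total = t_density(t, df) + t_density(0.0, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_density(t + i * h, df)
    return 0.5 - total * h / 3


statistic, df = -1.9764, 9           # df = 9 is an assumption (n = 10)
p_one_sided = t_lower_tail(statistic, df)
p_two_sided = 2 * p_one_sided        # both tail regions, by symmetry

print(f"one-sided p-value: {p_one_sided:.3f}")
print(f"two-sided p-value: {p_two_sided:.3f}")
```

Under these assumptions the results should be close to the values 0.040 and 0.080 given in the example, leading to rejection at the 0.05 level in the one-sided formulation but not in the two-sided one.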


3.1.50 Power of a Test 


It is the probability of rejecting the null hypothesis, 
when the null hypothesis is not true. It is one minus 
the probability (see 4.5) of the Type II error (see 3.1.47). 


NOTES 


1 The power of the test for a specified value of an unknown 
parameter (see 4.9) in a family of distributions (see 4.8) equals 
the probability of rejecting the null hypothesis (see 3.1.41) for 
that parameter value. 


2 In most cases of practical interest, increasing the sample size 
will increase the power of a test. In other words, the probability 
of rejecting the null hypothesis, when the alternative hypothesis 
(see 3.1.42) is true increases with increasing sample size, 
thereby reducing the probability of a Type II error. 


3 It is desirable in testing situations that as the sample size 
becomes extremely large, even small departures from the null
hypothesis ought to be detected, leading to the rejection of the 
null hypothesis. In other words, the power of the test should 
approach 1 for every alternative to the null hypothesis as the 
sample size becomes infinitely large. Such tests are referred to 
as consistent. In comparing two tests with respect to power, 
the test with the higher power is deemed the more efficient 
provided the significance levels are identical as well as the 
particular null and alternative hypotheses. 


3.1.51 Power Curve 


Collection of values of the power of a test (see 3.1.50) 
as a function of the population parameter (see 4.9) from 
a family of distributions (see 4.8). 


NOTE — The power function is equal to one minus the 
operating characteristic curve. 
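
A power curve can be sketched for the simplest case. The Python code below assumes a one-sided test of H_0: μ ≤ μ_0 against H_1: μ > μ_0 for a normal distribution with known standard deviation, and computes the power at several hypothetical values of μ together with the corresponding operating characteristic values. The function name and all numerical settings are illustrative assumptions.

```python
from statistics import NormalDist


def power_one_sided_z(mu, mu_0, sigma, n, alpha=0.05):
    """Probability of rejecting H_0: mu <= mu_0 when the true mean is mu."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha)   # rejection threshold for z
    shift = (mu - mu_0) * n ** 0.5 / sigma     # standardized departure
    return 1 - std_normal.cdf(critical - shift)


mu_0, sigma, n = 12.5, 2.0, 10                 # hypothetical settings
means = [12.5, 13.0, 13.5, 14.0, 14.5]

power_curve = [(mu, power_one_sided_z(mu, mu_0, sigma, n)) for mu in means]
oc_curve = [(mu, 1 - power) for mu, power in power_curve]

for (mu, power), (_, oc) in zip(power_curve, oc_curve):
    print(f"mu = {mu}: power = {power:.3f}, operating characteristic = {oc:.3f}")
```

At μ = μ_0 the power equals the significance level, and it rises towards 1 as μ moves further into the alternative or as the sample size increases.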


3.1.52 Test Statistic 


Statistic (see 3.1.8) used in conjunction with a statistical 
test (see 3.1.48). 


NOTE — The test statistic is used to assess whether the 
probability distribution (see 4.11) at hand is consistent with 
the null hypothesis (see 3.1.41) or the alternative hypothesis 
(see 3.1.42). 


3.1.53 Graphical Descriptive Statistics 


Descriptive statistics (see 3.1.5) in pictorial form. 


NOTE — The intent of descriptive statistics is generally to 
reduce a large number of values to a manageable few or to 
present the values in a way to facilitate visualization. Examples 
of graphical summaries include boxplots, probability plots, 
Q-Q plots, normal quantile plots, scatterplots, multiple 
scatterplots, multi-vari analysis and histograms (see 3.1.61).


3.1.54 Numerical Descriptive Statistics


Descriptive statistics (see 3.1.5) in numerical form.


NOTE — Numerical descriptive statistics include average 
(see 3.1.15), sample range (see 3.1.10), sample standard 
deviation (see 3.1.17), interquartile range, and so forth. 


3.1.55 Classes 


NOTE — The classes are assumed to be mutually exclusive 
and exhaustive. The real line is all the real numbers between 
−∞ and +∞.


3.1.55.1 Class 


<Qualitative characteristic> subset of items from a
sample (see 3.1.3). 


3.1.55.2 Class 


<Ordinal characteristic> set of one or more adjacent 
categories on an ordinal scale. 


3.1.55.3 Class


<Quantitative characteristic> interval of the real line.


3.1.56 Class Limits/Class Boundaries


<Quantitative characteristic> values defining the upper 
and lower bounds of a class (see 3.1.55). 


NOTE — This definition refers to class limits associated with 
quantitative characteristics. 


3.1.57 Mid-point of Class 


It is also known as ‘class mark’, because this value 
can be used to represent the whole class. <Quantitative 
characteristic> average (see 3.1.15) of upper and lower 
class limits (see 3.1.56). 


3.1.58 Class Width 


<Quantitative characteristic> upper limit of a class 
minus the lower limit of a class (see 3.1.55). 


3.1.59 Frequency 


Number of occurrences or observed values (see 3.1.4) 
in a specified class (see 3.1.55). 


3.1.60 Frequency Distribution 


Empirical relationship between classes (see 3.1.55) and 
their number of occurrences or observed values 
(see 3.1.4). 


3.1.61 Histogram 


Graphical representation of a frequency distribution 
(see 3.1.60) consisting of contiguous rectangles, each 
with base width equal to the class width (see 3.1.58) 
and area proportional to the class frequency. 


NOTE — Care needs to be taken for situations in which the 
data arises in classes having unequal class widths. 
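
As an informal illustration of the care needed with unequal class widths, the sketch below scales each rectangle's height to frequency divided by class width, so that area stays proportional to frequency. The class limits and frequencies are hypothetical values chosen only for this illustration.

```python
# Hypothetical class limits and frequencies (illustrative values only).
class_limits = [(0, 10), (10, 20), (20, 50)]
frequencies = [5, 10, 15]


def histogram_heights(limits, freqs):
    """Return heights such that height * class width equals the class frequency."""
    return [f / (upper - lower) for (lower, upper), f in zip(limits, freqs)]


heights = histogram_heights(class_limits, frequencies)
for (lower, upper), f, h in zip(class_limits, frequencies, heights):
    print(f"[{lower}, {upper}): frequency={f}, width={upper - lower}, height={h}")
```

Plotting the raw frequencies as heights would make the widest class look three times as prominent as its frequency warrants.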


3.1.62 Bar Chart 


Graphical representation of a frequency distribution 
(see 3.1.60) of a nominal property consisting of a set 
of rectangles of uniform width with height proportional 
to frequency (see 3.1.59). 


NOTES 


1 The rectangles are sometimes depicted as three-dimensional 
images for apparently aesthetic purposes, although this adds 
no additional information and is not a recommended 
presentation. For a bar chart, the rectangles need not be 
contiguous. 


2 The distinction between histograms and bar charts has become 
increasingly blurred as available software does not always 
follow the definitions given here. 


3.1.63 Cumulative Frequency 


Frequency (see 3.1.59) for classes up to and including 
a specified limit. 


NOTE — This definition is only applicable for specified values 
that correspond to class limits (see 3.1.56). 


3.1.64 Relative Frequency 


Frequency (see 3.1.59) divided by the total number of 
occurrences or observed values (see 3.1.4). 


3.1.65 Cumulative Relative Frequency 


Cumulative frequency (see 3.1.63) divided by the total 
number of occurrences or observed values (see 3.1.4). 
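
The four frequency concepts in 3.1.59 and 3.1.63 to 3.1.65 can be sketched together as follows. The class frequencies are hypothetical, and the classes are assumed to be listed in increasing order of their upper class limits.

```python
from itertools import accumulate

# Hypothetical frequencies for four classes, ordered by upper class limit.
frequencies = [4, 6, 8, 2]

total = sum(frequencies)                                # total number of observed values
cumulative = list(accumulate(frequencies))              # cumulative frequency (3.1.63)
relative = [f / total for f in frequencies]             # relative frequency (3.1.64)
cumulative_relative = [c / total for c in cumulative]   # cumulative relative frequency (3.1.65)

print(cumulative)
print(relative)
print(cumulative_relative)
```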


4 TERMS USED IN PROBABILITY 


4.1 Sample Space Ω 
Set of all possible outcomes of a random experiment. 
Examples : 


1) Consider the failure times of batteries 
purchased by a consumer. If the battery has 
no power upon initial use, its failure time is 
0. If the battery does function for a while, it 
produces a failure time of some number of 
hours. The sample space therefore consists of 
the outcomes {battery fails upon initial 
attempt} and {battery fails after x hours where 
x is greater than zero hours}. This example 
will be used throughout this clause. In 
particular, an extensive discussion of this 
example is given in 4.68. 

2) A box contains 10 resistors that are labelled 
1, 2, 3, 4, 5, 6, 7, 8, 9, 10. If two resistors 
were randomly sampled without replacement 
from this collection of resistors, the sample 
space consists of the following 45 outcomes: 
(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), 
(1, 9), (1, 10), (2, 3), (2, 4), (2, 5), (2, 6), (2, 7), 
(2, 8), (2, 9), (2, 10), (3, 4), (3, 5), (3, 6), (3, 7), 
(3, 8), (3, 9), (3, 10), (4, 5), (4, 6), (4, 7), (4, 8), 
(4, 9), (4, 10), (5, 6), (5, 7), (5, 8), (5, 9), 
(5, 10), (6, 7), (6, 8), (6, 9), (6, 10), (7, 8), 
(7, 9), (7, 10), (8, 9), (8, 10), (9, 10). The event 
(1, 2) is deemed the same as (2, 1), so that the 
order in which resistors are sampled does not 
matter. If alternatively the order does matter, 
so (1, 2) is considered different from (2, 1), 
then there are a total of 90 outcomes in the 
sample space. 

3) If in the preceding example, the sampling 
were performed with replacement, then the 
additional events (1, 1), (2, 2), (3, 3), (4, 4), 
(5, 5), (6, 6), (7, 7), (8, 8), (9, 9) and (10, 10) 
would also need to be included. In the case 
where ordering does not matter, there would 
be 55 outcomes in the sample space. In the 
ordering matters situation, there would be 100 
outcomes in the sample space. 


NOTES 


1 Outcomes could arise from an actual experiment or a 
completely hypothetical experiment. This set could be 
an explicit list, a countable set such as positive integers, 
{1, 2, 3, ... }, or the real line, for example. 


2 Sample space is the first component of a probability 
space (see 4.68). 
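
The outcome counts in Examples 2 and 3 (45, 90, 55 and 100) can be checked by direct enumeration. The following is a minimal sketch using the Python standard library; the variable names are illustrative only.

```python
from itertools import (
    combinations,
    combinations_with_replacement,
    permutations,
    product,
)

resistors = range(1, 11)  # resistors labelled 1 to 10

# Two resistors drawn; each list is one version of the sample space.
unordered_without = list(combinations(resistors, 2))                # order ignored, no replacement
ordered_without = list(permutations(resistors, 2))                  # order recorded, no replacement
unordered_with = list(combinations_with_replacement(resistors, 2))  # order ignored, with replacement
ordered_with = list(product(resistors, repeat=2))                   # order recorded, with replacement

print(len(unordered_without), len(ordered_without),
      len(unordered_with), len(ordered_with))
```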


4.2 Event A 
Subset of the sample space (see 4.1). 
Examples: 


1) Continuing with Example 1 of 4.1, the 
following are examples of events: {0}, (0, 2), 
{5.7}, [7, +∞), corresponding to an initially 
failed battery, a battery that works initially but 
fails before two hours, a battery that fails at 
exactly 5.7 h and a battery that has not yet 
failed at 7 h. The {0} and {5.7} are each sets 
containing a single value; (0, 2) is an open 
interval of the real line; [7, +∞) is a left closed 
infinite interval of the real line. 

2) Continuing with Example 2 of 4.1, restrict 
attention to selection without replacement and 
without recording the selection order. One 
possible event is A defined by {at least one of 
the resistors 1 or 2 is included in the sample}. 
This event contains the 17 outcomes (1, 2), 
(1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (1, 8), (1, 9), 
(1, 10), (2, 3), (2, 4), (2, 5), (2, 6), (2, 7), (2, 8), 
(2, 9), and (2, 10). Another possible event B is 
{none of the resistors 8, 9 or 10 is included in 
the sample}. This event contains the 21 outcomes 
(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (2, 3), 
(2, 4), (2, 5), (2, 6), (2, 7), (3, 4), (3, 5), (3, 6), 
(3, 7), (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7). 

3) Continuing with Example 2, the intersection of 
events A and B (that is, that at least one of the 
resistors 1 and 2 is included in the sample, but 
none of the resistors 8, 9 and 10), contains the 
following 11 outcomes: (1, 2), (1, 3), (1, 4), 
(1, 5), (1, 6), (1, 7), (2, 3), (2, 4), (2, 5), (2, 6), 
(2, 7). 


The union of the events A and B contains the following 
27 outcomes: (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), 
(1, 8), (1, 9), (1, 10), (2, 3), (2, 4), (2, 5), (2, 6), (2, 7), 
(2, 8), (2, 9), (2, 10), (3, 4), (3, 5), (3, 6), (3, 7), (4, 5), 
(4, 6), (4, 7), (5, 6), (5, 7), and (6, 7). 


Incidentally, the number of outcomes in the union of 
the events A and B (that is, that at least one of the 
resistors 1 and 2 or none of the resistors 8, 9, and 10, is 
included in the sample) is 27 which also equals 
17 + 21 − 11, namely the number of outcomes in A plus the 
number of outcomes in B minus the number of outcomes 
in the intersection is equal to the number of outcomes 
in the union of the events. 


NOTE — Given an event and an outcome of an experiment, 
the event is said to have occurred, if the outcome belongs to 
the event. Events of practical interest will belong to the sigma 
algebra of events (see 4.69), the second component of the 
probability space (see 4.68). Events naturally occur in gambling 
contexts (poker, roulette, and so forth) where determining the 
number of outcomes that belong to an event determines the 
odds for betting. 
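
The events A and B of the resistor example, together with their intersection and union, can be built as ordinary sets of outcomes. This is a sketch only, assuming unordered sampling without replacement as in Example 2; the names are illustrative.

```python
from itertools import combinations

# Sample space: unordered pairs of distinct resistors labelled 1 to 10.
sample_space = set(combinations(range(1, 11), 2))

# A: at least one of the resistors 1 or 2 is included in the sample.
A = {o for o in sample_space if 1 in o or 2 in o}
# B: none of the resistors 8, 9 or 10 is included in the sample.
B = {o for o in sample_space if not ({8, 9, 10} & set(o))}

intersection = A & B
union = A | B

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|.
print(len(A), len(B), len(intersection), len(union))
```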


4.3 Complementary Event Aᶜ 


Subset of the sample space that represents the 
non-occurrence of the event, that is, the sample space 
(see 4.1) excluding the given event (see 4.2). 


Examples: 


1) Continuing with the battery Example 1 of 4.1, 
the complement of the event {0} is the event 
(0, +∞); that is, the complement of the event 
that the battery did not function initially is 
the event that the battery did function 
initially. Similarly, the event [0, 3) 
corresponds to the cases that either the battery 
was not functioning initially or it did function 
less than three hours. The complement of this 
event is [3, +∞), which corresponds to the case 
that a battery was working at 3 h and its failure 
time is at least this value. 


2) Continuing with Example 2 of 4.2, the 
number of outcomes in B can be found easily 
by considering the complementary event to B, 
namely Bᶜ = {the sample contains at least one 
of the resistors 8, 9 or 10}. This event contains 
the 7 + 8 + 9 = 24 outcomes (1, 8), (2, 8), (3, 8), 
(4, 8), (5, 8), (6, 8), (7, 8), (1, 9), (2, 9), (3, 9), 
(4, 9), (5, 9), (6, 9), (7, 9), (8, 9), (1, 10), 
(2, 10), (3, 10), (4, 10), (5, 10), (6, 10), (7, 10), 
(8, 10), (9, 10). As the entire sample space 
contains 45 outcomes in this case, the event 
B contains 45 − 24 = 21 outcomes [namely: 
(1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7), (2, 3), 
(2, 4), (2, 5), (2, 6), (2, 7), (3, 4), (3, 5), (3, 6), 
(3, 7), (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7)]. 


NOTES 


1 The complementary event is the complement of the 
event in the sample space. 


2 The complementary event is also an event. 


3 For an event A, the complementary event to A is 
usually designated by the symbol Aᶜ. 

4 In many situations, it may be easier to compute the 
probability of the complement of an event than the 
probability of the event. For example, the event defined 
by “at least one defect occurs in a sample of 10 items 
chosen at random from a population of 1 000 items, 
having an assumed one percent defectives” has a huge 
number of outcomes to be listed. The complement of 
this event (no defects found) is much easier to deal 
with. 
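
Both uses of the complement above can be sketched numerically. The first part counts B through its complement in the resistor example. The second part evaluates the defect example of Note 4 under two assumptions that the note does not state explicitly: one percent of 1 000 items means exactly 10 defective items, and the 10 sampled items are drawn without replacement.

```python
from itertools import combinations
from math import comb

# Resistor example: count B through its complement.
sample_space = set(combinations(range(1, 11), 2))
B_complement = {o for o in sample_space if {8, 9, 10} & set(o)}
B = sample_space - B_complement
print(len(B_complement), len(B))

# Note 4 (assumed: 10 defectives among 1 000 items, sample of 10, no replacement).
p_no_defect = comb(990, 10) / comb(1000, 10)   # every sampled item is non-defective
p_at_least_one = 1 - p_no_defect               # complement rule
print(round(p_at_least_one, 4))
```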


4.4 Independent Events 


Pair of events (see 4.2) such that the probability 
(see 4.5) of the intersection of the two events is the 
product of the individual probabilities. 


Examples: 


1) Consider a two die tossing situation, with one 
red die and one white die so as to distinguish 
the 36 possible outcomes, with probability 1/36 
assigned to each. Dᵢ is defined as the event 
where the sum of the dots on the red and white 
die is i. W is defined as the event that the white 
die shows one dot. The events D₇ and W are 
independent, whereas the events Dᵢ and W are 
not independent for i = 2, 3, 4, 5 or 6. Events 
that are not independent are referred to as 
dependent events. 


2) Independent and dependent events arise 
naturally in applications. In cases where 
events or circumstances are dependent, it is 
quite useful to know of the outcome of a 
related event. For example, an individual 
about to undergo heart surgery could have 
very different prospects for success, if it is 
the case that this individual had a smoking 
history or other risk factors. Thus, smoking 
and death from invasive procedures could be 
dependent. In contrast, death would likely be 
independent of the day of the week that this 
person was born. In a reliability context, 
components having a common cause of 
failure do not have independent failure times. 
Fuel rods in a reactor have a presumably low 
probability of cracks occurring but given that 
a fuel rod cracks, the probability of an 
adjacent rod cracking may increase 
substantially. 


3) Continuing Example 2 of 4.2, assume that the 
sampling has been done by simple random 
sampling, such that all outcomes have the 
same probability 1/45. 


Then P(A) = 17/45 = 0.377 8, P(B) = 21/45 = 0.466 7 
and P(A and B) = 11/45 = 0.244 4. However, the product 
P(A) × P(B) = (17/45) × (21/45) = 0.176 3, which is 
different from 0.244 4, so the events A and B are not 
independent. 


NOTE — This definition is given in the context of two events 
but can be extended. For events A and B, the independence 
condition is P(A ∩ B) = P(A)P(B). For three events A, B and 
C to be independent, it is required that: 

P(A ∩ B ∩ C) = P(A)P(B)P(C), 
P(A ∩ B) = P(A)P(B), 
P(A ∩ C) = P(A)P(C), and 
P(B ∩ C) = P(B)P(C). 

In general, for more than two events, A₁, A₂, ..., Aₙ are 
independent if the probability of the intersection of any given 
subset of the events equals the product of the probabilities of 
the individual events, this condition holding for each and every 
subset. It is possible to construct an example in which each pair 
of events is independent, but the three events are not independent 
(that is pairwise, but not complete independence). 
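
The dice example can be checked by enumerating the 36 equally likely outcomes and testing the product condition with exact fractions. The sketch below is illustrative only; the helper names prob, independent and D are not taken from the standard.

```python
from fractions import Fraction
from itertools import product

# Each outcome is (red, white); all 36 outcomes have probability 1/36.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def independent(e1, e2):
    """True if P(e1 and e2) equals P(e1) * P(e2)."""
    return prob(lambda o: e1(o) and e2(o)) == prob(e1) * prob(e2)


def W(o):
    """White die shows one dot."""
    return o[1] == 1


def D(i):
    """Event that the two dice sum to i."""
    return lambda o: o[0] + o[1] == i


print(independent(D(7), W))
print([i for i in range(2, 7) if independent(D(i), W)])
```

Exact fractions are used so that the equality test is not affected by floating-point rounding.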


4.5 Probability of an Event A P(A) 


Real number in the closed interval [0, 1] assigned to 
an event (see 4.2). 


Example — Continuing with Example 2 of 4.1, the 
probability for an event can be found by adding the 
probabilities for all outcomes constituting the event. If 
all the 45 outcomes have the same probability, each of 
them will have the probability 1/45. The probability of 
an event can be found by counting the number of 
outcomes and dividing this number by 45. 


NOTES 


1 Probability measure (see 4.70) provides assignment of real 
numbers for every event of interest in the sample space. Taking 
an individual event, the assignment by the probability measure 
gives the probability associated with the event. In other words, 
probability measure yields the complete set of assignments for 
all of the events, whereas probability represents one specific 
assignment for an individual event. 

2 This definition refers to probability as probability of a specific 
event. Probability can be related to a long-run relative 
frequency of occurrences or to a degree of belief in the likely 
occurrence of an event. Typically, the probability of an event 
A is denoted by P(A). The notation ℘(A) using the script letter 
℘ is used in contexts where there is the need to explicitly 
consider the formality of a probability space (see 4.68). 


4.6 Conditional Probability P(A|B) 


Probability (see 4.5) of the intersection of A and B 
divided by the probability of B. 


Examples: 


1) Continuing the battery Example 1 of 4.1, 
consider the event (see 4.2) A defined as {the 
battery survives at least three hours}, namely 
[3, ∞). Let the event B be defined as {the 
battery functioned initially}, namely (0, ∞). 
The conditional probability of A given B takes 
into account that one is dealing with the 
initially functional batteries. 


2) Continuing with Example 2 of 4.1, if the 
selection is without replacement, the 
probability of selecting resistor 2 in the second 
draw is equal to zero given that it has been 
selected in the first draw. If the probabilities 
are equal for all resistors to be selected, the 
probability for selecting resistor 2 in the 
second draw equals 0.111 1 given that it has 
not been selected in the first draw. 


3) Continuing with Example 2 of 4.1, if the 
selection is done with replacement and the 
probabilities are the same for all resistors to 
be selected within each draw, then the 
probability of selecting resistor 2 in the second 
draw will be 0.1 whether or not resistor 2 has 
been selected in the first draw. Thus the 
outcomes of the first and the second draw are 
independent events. 


NOTES 


1 The probability of the event B is required to be greater 
than zero. 

2 “A given B” can be stated more fully as “the event A 
given the event B has occurred”. The vertical bar in the 
symbol for conditional probability is pronounced 
“given”. 

3 If the conditional probability of the event A given that 
the event B occurred is equal to the probability of A 
occurring, the events A and B are independent. In other 
words, the knowledge of occurrence of B suggests no 
adjustment to the probability of A. 
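
The values in Example 2 (zero and 0.111 1) follow from the definition P(A|B) = P(A ∩ B)/P(B). A minimal sketch, assuming all 90 ordered draws without replacement are equally likely; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import permutations

# Ordered draws (first, second) of two distinct resistors from 1 to 10.
ordered = list(permutations(range(1, 11), 2))


def prob(event):
    """Probability of an event given as a predicate on ordered draws."""
    return Fraction(sum(1 for o in ordered if event(o)), len(ordered))


def conditional(a, b):
    """P(a | b) = P(a and b) / P(b); requires P(b) > 0 (see Note 1)."""
    p_b = prob(b)
    if p_b == 0:
        raise ValueError("conditioning event has probability zero")
    return prob(lambda o: a(o) and b(o)) / p_b


def second_is_2(o):
    return o[1] == 2


def first_is_2(o):
    return o[0] == 2


def first_not_2(o):
    return o[0] != 2


print(conditional(second_is_2, first_is_2))    # resistor 2 already drawn
print(conditional(second_is_2, first_not_2))   # resistor 2 still in the box
```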


4.6.1 Mutually Exclusive Events 


Two events are said to be mutually exclusive if and 
only if occurrence of one event implies non-occurrence 
of the other event. 


4.7 Distribution Function of a Random Variable X 
F(x) 


Function of x giving the probability (see 4.5) of the 
event (see 4.2) (−∞, x]. 


NOTES 


1 The interval (−∞, x] is the set of all values up to and 
including x. 

2 The distribution function completely describes the probability 
distribution (see 4.11) of the random variable (see 4.10). 
Classifications of distributions as well as classifications of 
random variables into discrete or continuous classes are based 
on classifications of distribution functions. 


3 Since random variables take values that are real numbers or 
ordered k-tuples of real numbers, it is implicit in the definition 
that x is also a real number or an ordered k-tuple of real 
numbers. The distribution function for a multivariate 
distribution (see 4.17) gives the probability (see 4.5) that each 
of the random variables of the multivariate distribution is less 
than or equal to a specified value. 


Notationally, a multivariate distribution function is given by 
F(x₁, x₂, ..., xₙ) = P[X₁ ≤ x₁, X₂ ≤ x₂, ..., Xₙ ≤ xₙ]. Also, a 
distribution function is non-decreasing. In a univariate setting, 
the distribution function is given by F(x) = P[X ≤ x], which 
gives the probability of the event that the random variable X 
takes on a value less than or equal to x. 


4 Commonly, distribution functions are classified into discrete 
distribution (see 4.22) functions and continuous distribution 
(see 4.23) functions but there are other possibilities. Recalling 
the battery example of 4.1, one possible distribution function 
is, as follows: 


F(x) = 0                           if x < 0 
F(x) = 0.1                         if x = 0 
F(x) = 0.1 + 0.9 [1 − exp(−x)]     if x > 0 


From this specification of the distribution function, battery life 
is non-negative. There is a 10 percent chance that the battery 
does not function on the initial attempt. If the battery does in 
fact function initially, then its battery life has an exponential 
distribution (see 4.58) with mean life of 1 h. 

5 Often the abbreviation cdf (cumulative distribution function) 
is given for distribution function. 
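
The battery distribution function of Note 4 can be written as a short function, which makes the jump of 0.1 at x = 0 and the exponential part for x > 0 easy to inspect. This is a sketch only; the function name battery_cdf is illustrative.

```python
import math


def battery_cdf(x):
    """Distribution function F(x) = P[X <= x] for the battery example of Note 4."""
    if x < 0:
        return 0.0                              # battery life is non-negative
    # At x = 0 this evaluates to 0.1, the probability of an initial failure.
    return 0.1 + 0.9 * (1.0 - math.exp(-x))


for x in (-1.0, 0.0, 1.0, 3.0, 6.0):
    print(x, round(battery_cdf(x), 4))
```

For instance, 1 − battery_cdf(6.0) gives the probability that the battery lasts longer than 6 h under this model.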


4.8 Family of Distributions 
Set of probability distributions (see 4.11). 


NOTES 


1 The set of probability distributions is often indexed by a 
parameter (see 4.9) of the probability distribution. 

2 Often the mean (see 4.35) and/or the variance (see 4.36) of 
the probability distribution is used as the index of the family of 
distributions or as part of the index in cases where more than 
two parameters are needed to index the family of distributions. 
On other occasions, the mean and variance are not necessarily 
explicit parameters in the family of distributions but rather a 
function of the parameters. 


4.9 Parameter 


Index of a family of distributions (see 4.8). 


NOTES 
1 The parameter may be one-dimensional or multi-dimensional. 


2 Parameters are sometimes referred to as location parameters, 
particularly if the parameter corresponds directly to the mean 
of the family of distributions. Some parameters are described 
as scale parameters, particularly if they are equal or 
proportional to the standard deviation (see 4.37) of the 
distribution. Parameters that are neither location nor scale 
parameters are generally referred to as shape parameters. 


4.10 Random Variable 


Function defined on a sample space (see 4.1) where 
the values of the function are ordered k-tuplets of real 
numbers. 


Example — Continuing the battery example introduced 
in 4.1, the sample space consists of events which are 
described in words (battery fails upon initial attempt, 
battery works initially but then fails at x hours). Such 
events are awkward to work with mathematically, so it 
is natural to associate with each event, the time (given 
as a real number) at which the battery fails. If the 
random variable takes the value 0, then one would 
recognize that this outcome corresponds to an initial 
failure. For a value of the random variable greater than 
zero, it would be understood that the battery initially 
worked and then subsequently failed at this specific 
value. The random variable representation allows one 
to answer questions such as, ‘what is the probability 
that the battery exceeds its warranty life, that is 6 h?’. 


NOTES 


1 An example of an ordered k-tuplet is (x₁, x₂, ..., xₖ). An ordered 
k-tuplet is, in other words, a vector in k dimensions (either a 
row or column vector). 

2 Typically, the random variable has dimension denoted by k. 
If k = 1, the random variable is said to be one-dimensional or 
univariate. For k > 1, the random variable is said to be multi- 
dimensional. In practice, when the dimension is a given number, 
k, the random variable is said to be k-dimensional. 


3 A one-dimensional random variable is a real-valued function 
defined on the sample space (see 4.1) that is part of a probability 
space (see 4.68). 

4 A random variable with real values given as ordered pairs is 
said to be two-dimensional. The definition extends the ordered 
pair concept to ordered k-tuplets. 


5 The jth component of a k-dimensional random variable is the 
random variable corresponding to only the jth component of 
the k-tuplet. The jth component of a k-dimensional random 
variable corresponds to a probability space where events 
(see 4.2) are determined only in terms of values of the 
component considered. 


4.11 Probability Distribution/Distribution 


Probability measure (see 4.70) induced by a random 
variable (see 4.10). 


Example — Continuing with the battery example from 
4.1, the distribution of battery life completely describes 
the probabilities with which specific values occur. It is 
not known with certainty what the failure time of a 
given battery will be nor is it known (prior to testing) 
if the battery will even function upon the initial attempt. 
The probability distribution completely describes the 
probabilistic nature of an uncertain outcome. In Note 4 
of 4.7, one possible representation of the probability 
distribution was given, namely a distribution function. 


NOTES 


1 There are numerous, equivalent mathematical representations 
of a distribution including distribution function (see 4.7), 
probability density function (see 4.27), if it exists, and 
characteristic function. With varying levels of difficulty, these 
representations allow for determining the probability with 
which a random variable takes values in a given region. 


2 Since a random variable is a function on subsets of the sample 
space to the real line, it is the case, for example, that the probability 
that a random variable takes on any real value is 1. For the battery 
example, P[X ≥ 0] = 1. In many situations, it is much easier to deal 
directly with the random variable and one of its representations 
than to be concerned with the underlying probability measure. 
However, in converting from one representation to another, the 
probability measure ensures the consistency. 


3 The probability distribution of a random variable with a single 
component is called a one-dimensional or univariate probability 
distribution. If a random variable has two components, one 
speaks about a two-dimensional or bivariate probability 
distribution, and with more than two components, the random 
variable has a multidimensional or multivariate probability 
distribution. 


4.12 Expectation 


Integral of a function of a random variable (see 4.10) 
with respect to a probability measure (see 4.70) over 
the sample space (see 4.1). 


NOTES 


1 The expectation of the function g of a random variable X is 
denoted by E[g(X)] and is computed as: 


E[g(X)] = ∫_Ω g(X) dP = ∫_{Rᵏ} g(x) dF(x) 


where F(x) is the corresponding distribution function. 


2 The “E” in E[g(X)] comes from the “expected value” or 
“expectation” of the random variable X. E can be viewed as an 
operator or function that maps a random variable to the real 
line according to the above calculation. 


3 Two integrals are given for E[g(X)]. The first treats the 
integration over the sample space which is conceptually 
appealing but not of practical use, for reasons of clumsiness in 
dealing with events themselves (for example, if given verbally). 
The second integral depicts the calculation over Rᵏ, which 
is of greater practical interest. 


4 In many cases of practical interest, the above integral reduces 
to a form recognizable from calculus. Examples are given in 
the notes to moment of order r (see 4.34) where g(x) = xʳ, mean 
(see 4.35) where g(x) = x and variance (see 4.36) where 
g(x) = [x − E(X)]². 


5 The definition is not restricted to one dimensional integrals 
as the previous examples and notes might suggest. For higher 
dimensional situations, see 4.43. 


6 For a discrete random variable (see 4.28), the second integral 
in Note 1 is replaced by the summation symbol. Examples can 
be found in 4.35. 
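
For a discrete random variable, the summation mentioned in Note 6 can be sketched as follows. The sketch uses the binomial distribution with n = 6 and p = 0.3 that appears in Table 2 of 4.13; the helper name expectation is illustrative only.

```python
from math import comb

n, p = 6, 0.3  # binomial parameters, as in Table 2 of 4.13

# Probability mass function P[X = x] for x = 0, 1, ..., n.
pmf = {x: comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)}


def expectation(g):
    """E[g(X)] as the sum of g(x) * P[X = x] over all possible values x."""
    return sum(g(x) * pmf[x] for x in pmf)


mean = expectation(lambda x: x)                    # g(x) = x
variance = expectation(lambda x: (x - mean) ** 2)  # g(x) = [x - E(X)]^2
print(round(mean, 4), round(variance, 4))
```

The results agree with the closed forms np and np(1 − p) for the binomial distribution.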


4.13 p-quantile/p-fractile Xₚ, xₚ 


Value of x equal to the infimum of all x such that the 
distribution function (see 4.7) F(x) is greater than or 
equal to p, for 0 < p < 1. 


Examples: 


1) Consider a binomial distribution (see 4.46) with 
probability mass function given in Table 2. This set 
of values corresponds to a binomial distribution with 
parameters n = 6 and p = 0.3. For this case, some 
selected p-quantiles are: 


x_0.1 = 0 
x_0.25 = 1 
x_0.5 = 2 
x_0.75 = 3 
x_0.90 = 3 
x_0.95 = 4 
x_0.99 = 5 
x_0.999 = 5 


The discreteness of the binomial distribution leads 
to integral values of the p-quantiles. 


Table 2 Binomial Distribution Example 
(Example 1 Clause 4.13) 


Sl No.   x   P[X = x]    P[X ≤ x]    P[X > x] 
i)       0   0.117 649   0.117 649   0.882 351 
ii)      1   0.302 526   0.420 175   0.579 825 
iii)     2   0.324 135   0.744 310   0.255 690 
iv)      3   0.185 220   0.929 530   0.070 470 
v)       4   0.059 535   0.989 065   0.010 935 
vi)      5   0.010 206   0.999 271   0.000 729 
vii)     6   0.000 729   1.000 000   0.000 000 


2) Consider a standardized normal distribution (see 4.51) 
with selected values from its distribution function given 
in Table 3. Some selected p-quantiles are: 


Table 3 Standardized Normal 
Distribution Example 
(Example 2 Clause 4.13) 


Sl No.   p              x such that P[X ≤ x] = p 
i)       0.1            −1.282 
ii)      0.25           −0.674 
iii)     0.5            0.000 
iv)      0.75           0.674 
v)       0.841 344 75   1.000 
vi)      0.9            1.282 
vii)     0.95           1.645 
viii)    0.975          1.960 
ix)      0.99           2.326 
x)       0.995          2.576 
xi)      0.999          3.090 


Since the distribution of X is continuous, the second 
column heading could also be: x such that P[X < x] = p. 


NOTES 


1 For continuous distributions (see 4.23), if p is 0.5 then the 
0.5-quantile corresponds to the median (see 4.14). For p equal to 
0.25, the 0.25-quantile is known as the lower quartile. For 
continuous distributions, 25 percent of the distribution is below 
the 0.25-quantile while 75 percent is above the 0.25-quantile. For 
p equal to 0.75, the 0.75-quantile is known as the upper quartile. 
2 In general, 100 p percent of a distribution is below the 
p-quantile; 100(1 - p) percent of a distribution is above the 
p-quantile. There is a difficulty in defining the median for 
discrete distributions since it could be argued to have multiple 
values satisfying the definition. 

3 If F is continuous and strictly increasing, the p-quantile is 
the solution to F(x) = p. In this case, the word “infimum” in the 
definition could be replaced by “minimum”. 

4 If the distribution function is constant and equal to p in an 
interval, then all values in that interval are p-quantiles for F. 
5 p-quantiles are defined for univariate distributions 
(see 4.16). 
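
For the discrete case, the infimum in the definition can be found by scanning the possible values in increasing order and stopping at the first one where F(x) ≥ p. The sketch below applies this to the binomial distribution of Example 1 (n = 6, p = 0.3); the names cdf and quantile are illustrative.

```python
from math import comb

n, p_success = 6, 0.3  # binomial parameters from Example 1


def cdf(x):
    """Distribution function F(x) = P[X <= x] for integer x between 0 and n."""
    return sum(comb(n, k) * p_success**k * (1 - p_success) ** (n - k)
               for k in range(x + 1))


def quantile(p):
    """Smallest x with F(x) >= p, for 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    for x in range(n + 1):
        if cdf(x) >= p:
            return x
    return n  # guards against rounding in F(n), which equals 1 in exact arithmetic


levels = [0.1, 0.25, 0.5, 0.75, 0.90, 0.95, 0.99, 0.999]
print([quantile(p) for p in levels])
```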


4.14 Median 
0.5-quantile (see 4.13). 
Example: 


For the battery example of Note 4 in 4.7, the median is 
0.587 8, which is the solution for x in 
0.1 + 0.9[1 − exp(−x)] = 0.5. 


NOTES 


1 The median is one of the most commonly applied 
p-quantiles (see 4.13) in practical use. The median of a 
continuous univariate distribution (see 4.16) is such that half 
of the population (see 3.1.1) is greater than or equal to the 
median and half of the population is less than or equal to 
the median. 


2 Medians are defined for univariate distributions (see 4.16). 


4.15 Quartile 
0.25-quantile (see 4.13) or 0.75-quantile. 
Example: 


Continuing with the battery example of 4.14, it can be 
shown that the 0.25-quantile is 0.182 3 and the 
0.75-quantile is 1.280 9. 


NOTES 


1 The 0.25 quantile is also known as the lower quartile, while 
the 0.75 quantile is also known as the upper quartile. 


2 Quartiles are defined for univariate distributions (see 4.16). 
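
The median and quartile values quoted for the battery example can be reproduced by solving 0.1 + 0.9[1 − exp(−x)] = p for x, which gives x = −ln[(1 − p)/0.9] when p > 0.1. For p ≤ 0.1 the infimum in the definition of 4.13 is 0, because F(0) = 0.1. The function name below is illustrative.

```python
import math


def battery_quantile(p):
    """p-quantile of the battery life distribution of Note 4 in 4.7."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    if p <= 0.1:
        return 0.0  # F(0) = 0.1 already reaches p
    return -math.log((1.0 - p) / 0.9)


print(round(battery_quantile(0.25), 4))  # lower quartile
print(round(battery_quantile(0.5), 4))   # median
print(round(battery_quantile(0.75), 4))  # upper quartile
```
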
4.16 Univariate Probability Distribution/Univariate 
Distribution 


Probability distribution (see 4.11) of a single random 
variable (see 4.10). 


NOTE — Univariate probability distributions are one 
dimensional. The binomial (see 4.46), Poisson (see 4.47), 
normal (see 4.50), gamma (see 4.56), t (see 4.53), Weibull 
(see 4.63) and beta (see 4.59) distributions are examples of 
univariate probability distributions. 


4.17 Multivariate Probability Distribution/ 
Multivariate Distribution 


Probability distribution (see 4.11) of two or more 
random variables (see 4.10). 


NOTES 


1 For probability distributions with exactly two random 
variables, the qualifier multivariate is often replaced by the 
more restrictive qualifier bivariate. As mentioned in the 
Foreword, the probability distribution of a single random 
variable can be explicitly called a one dimensional or univariate 
distribution (see 4.16). Since this situation is in preponderance, 
it is customary to presume a univariate situation unless 
otherwise stated. 


2 The multivariate distribution is sometimes referred to as the 
joint distribution. 


3 The multinomial distribution (see 4.45), bivariate normal 
distribution (see 4.65) and the multivariate normal distribution 
(see 4.64) are examples of multivariate probability distributions 
covered in this standard. 


4.18 Marginal Probability Distribution/Marginal 
Distribution 


Probability distribution (see 4.11) of a non-empty, strict 
subset of the components of a random variable 
(see 4.10). It is obtained by marginalizing the effect of 
all the other variables present in the random experiment. 


Examples: 


1) For a distribution with three random variables 
X, Y and Z, there are three marginal 
distributions with two random variables, 
namely for (X, Y), (X, Z) and (Y, Z) and three 
marginal distributions with a single random 
variable, namely for X, Y and Z. 

2) For the bivariate normal distribution (see 4.65) 
of the pair of variables (X, Y), the distribution 
of each of the variables X and Y considered 
separately are marginal distributions, which 
are both normal distributions (see 4.50). 

3) For the multinomial distribution (see 4.45), 
the distribution of (X₁, X₂) is a marginal 
distribution if k > 3. The distributions of X₁, 
X₂, ..., Xₖ, separately are also marginal 
distributions. These marginal distributions are 
each binomial distributions (see 4.46). 


NOTES 


1 For a joint distribution in k dimensions, one example 
of a marginal distribution includes the probability 
distribution of a subset of k₁ < k random variables. 


2 Given a continuous (see 4.23) multivariate probability 
distribution (see 4.17) represented by its probability 
density function (see 4.26), the probability density 
function of its marginal probability distribution is 
determined by integrating the probability density 
function over the domain of the variables that are not 
considered in the marginal distribution. 

3 Given a discrete (see 4.22) multivariate probability 
distribution represented by its probability mass function 
(see 4.24), the probability mass function of its marginal 
probability distribution is determined by summing the 
probability mass function over the domain of the 
variables that are not considered in the marginal 
distribution. 
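
The summation described in Note 3 can be sketched for a small discrete joint distribution. The joint probability mass function below is hypothetical and chosen only for illustration; the helper name marginal is likewise illustrative.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical joint probability mass function of (X, Y).
joint = {
    (0, 0): Fraction(1, 8),
    (0, 1): Fraction(3, 8),
    (1, 0): Fraction(2, 8),
    (1, 1): Fraction(2, 8),
}


def marginal(joint_pmf, axis):
    """Sum the joint pmf over every component except the one at position `axis`."""
    result = defaultdict(Fraction)  # Fraction() is 0
    for outcome, probability in joint_pmf.items():
        result[outcome[axis]] += probability
    return dict(result)


marginal_x = marginal(joint, 0)
marginal_y = marginal(joint, 1)
print(marginal_x)
print(marginal_y)
```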


4.19 Conditional Probability Distribution/ 
Conditional Distribution 


Probability distribution (see 4.11) restricted to a 
non-empty subset of the sample space (see 4.1) and 
adjusted to have total probability one on the restricted 
sample space. 


Examples: 


1) In the battery example of 4.7, Note 4, the 
conditional distribution of battery life given 
that the battery functions initially is 
exponential (see 4.58). 

2) For the bivariate normal distribution (see 
4.65), the conditional probability distribution 
of Y given that X = x reflects the impact on Y 
from knowledge of X. 

3) Consider a random variable X depicting the 
distribution of annual insured loss costs in 
Florida due to declared hurricane events. This 
distribution would have a non-zero probability 
of zero annual loss costs owing to the 
possibility that no hurricane impacts Florida 
in a given year. Of possible interest is the 
conditional distribution of loss costs for those 
years in which an event actually occurs. 


NOTES 


1 As an example for a distribution with two random 
variables X and Y, there are conditional distributions 
for X and conditional distributions for Y. A distribution 
of X conditioned through Y = y is denoted as 
“conditional distribution of X given Y = y”, while a 
distribution of Y conditioned by X = x is denoted 
“conditional distribution of Y given X = x”. 

2 Marginal probability distributions (see 4.18) can be 
viewed as unconditional distributions. 


3 Example 1 above illustrates the situation where a 
univariate distribution is adjusted through conditioning 
to yield another univariate distribution, which in this 
case is a different distribution. In contrast, for the 
exponential distribution, the conditional distribution 
that a failure will occur within the next hour, given that 
no failures have occurred during the first 10 h, is 
exponential with the same parameter. 


4 Conditional distributions can arise for certain discrete 
distributions where specific outcomes are impossible. 
For example, the Poisson distribution could serve as a 
model for number of cancer patients in a population of 
infected patients if conditioned on being strictly positive 
(a patient with no tumours is not by definition infected). 


5 Conditional distributions arise in the context of
restricting the sample space to a particular subset. For
(X, Y) having a bivariate normal distribution (see 4.65)
it may be of interest to consider the conditional
distribution of (X, Y) given that the outcome must occur
in the unit square [0, 1] × [0, 1]. Another possibility is
the conditional distribution of (X, Y) given that
$X^2 + Y^2 < r$. This case corresponds to a situation where, for
example, a part meets a tolerance and one might be
interested in further properties based on achieving this
performance.
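
The contrast drawn in Note 3 can be checked numerically. The following Python sketch assumes an exponential distribution with mean 1 (the function names and the chosen mean are illustrative and not part of this standard). It restricts the sample space to the event that no failure has occurred during the first 10 h, renormalizes, and compares the result with the unconditional probability of a failure within the first hour.

```python
import math

def exp_cdf(x, beta=1.0):
    """Distribution function of the exponential distribution with mean beta."""
    return 1.0 - math.exp(-x / beta) if x > 0 else 0.0

def conditional_failure_prob(s, t, beta=1.0):
    """P(s < X <= s + t | X > s): restrict to the event {X > s} and renormalize."""
    survive_s = 1.0 - exp_cdf(s, beta)
    return (exp_cdf(s + t, beta) - exp_cdf(s, beta)) / survive_s

# Failure within the next hour, given no failure during the first 10 h
p_conditional = conditional_failure_prob(10.0, 1.0)
# Failure within the first hour, with no conditioning
p_unconditional = exp_cdf(1.0)
print(p_conditional, p_unconditional)
```

The two printed values agree to within rounding error, which is the sense in which the conditional distribution is exponential with the same parameter.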


4.20 Regression Curve 


It is the smooth free-hand curve fitted to the set of paired
data in regression analysis; formally, the collection of values of
the expectation (see 4.12) of the conditional probability
distribution (see 4.19) of a random variable (see 4.10)
Y given a random variable X = x.


NOTE — Here, regression curve is defined in the context of 
(X, Y) having a bivariate distribution (see Note 1 under 4.17). 
Hence, it is a different concept than those found in regression 
analysis in which Y is related to a deterministic set of 
independent values. 


4.21 Regression Surface 


Collection of values of the expectation (see 4.12) of
the conditional probability distribution (see 4.19) of a
random variable (see 4.10) Y given the random
variables $X_1 = x_1$ and $X_2 = x_2$.


NOTE — Here, as in 4.20, regression surface is defined in the
context of $(Y, X_1, X_2)$ having a multivariate distribution
(see 4.17). As with the regression curve, the regression surface
involves a concept distinct from those found in regression
analysis and response surface methodology.


4.22 Discrete Probability Distribution/
Discrete Distribution


Probability distribution (see 4.11)
for which the sample space (see 4.1) is finite or
countably infinite.


Example — Examples of discrete distributions in this 
document are multinomial (see 4.45), binomial (see 
4.46), Poisson (see 4.47), hypergeometric (see 4.48) 
and negative binomial (see 4.49). 


NOTES 


1 "Discrete" implies that the sample space can be given in a 
finite list or the beginnings of an infinite list in which the 
subsequent pattern is apparent, such as the number of defects 
being 0, 1, 2, ... Additionally, the binomial distribution
corresponds to a finite sample space {0, 1, 2, ..., n} whereas
the Poisson distribution corresponds to a countably infinite
sample space {0, 1, 2, ...}.

2 Situations with attribute data in acceptance sampling involve 
discrete distributions. 

3 The distribution function (see 4.7) of a discrete distribution 
is discrete valued. 


4.23 Continuous Probability Distribution/
Continuous Distribution


Probability distribution (see 4.11) for which the 
distribution function (see 4.7) evaluated at x can be
expressed as an integral of a non-negative function from
$-\infty$ to x.


Example — Situations where continuous distributions 
occur are virtually any of those involving variables type 
data found in industrial applications. 


NOTES 


1 Examples of continuous distributions are normal (see 4.50), 
standardized normal (see 4.51), t (see 4.53), F (see 4.55), 
gamma (see 4.56), chi-squared (see 4.57), exponential 
(see 4.58), beta (see 4.59), uniform (see 4.60), Type I extreme
value (see 4.61), Type II extreme value (see 4.62), Type III 
extreme value (see 4.63), and lognormal (see 4.52). 


2 The non-negative function referred to in the definition is the 
probability density function (see 4.26). It is unduly restrictive 
to insist that a distribution function be differentiable 
everywhere. However, for practical considerations, many 
commonly used continuous distributions enjoy the property 
that the derivative of the distribution function provides the 
corresponding probability density function. 


3 Situations with variables data in acceptance sampling 
applications correspond to continuous probability 
distributions. 


4.24 Probability Mass Function 


«Discrete distribution» function giving the probability 
(see 4.5) that a random variable (see 4.10) equals a 
given value 


Examples: 


1) The probability mass function describing the 
random variable X equal to the number of heads 
resulting from tossing three fair coins is: 


P(X = 0) = 1/8 
P(X = 1) = 3/8
P(X = 2) = 3/8 
P(X = 3) = 1/8 


2) Various probability mass functions are given 
in defining common discrete distributions 
(see 4.22) encountered in applications. 
Subsequent examples of univariate discrete 
distributions include the binomial (see 4.46), 
Poisson (see 4.47), hypergeometric (see 4.48) 
and negative binomial (see 4.49). An example
of a multivariate discrete distribution is the
multinomial (see 4.45).


NOTES 


1 The probability mass function can be given as $P(X = x_i) = p_i$,
where X is the random variable, $x_i$ is a given value, and $p_i$ is the
corresponding probability.


2 A probability mass function was introduced in the p-quantile 
Example 1 of 4.13 using the binomial distribution (see 4.46). 


4.25 Mode of Probability Mass Function 


Value(s) where a probability mass function (see 4.24)
attains a local maximum.


Example — The binomial distribution (see 4.46) with
n = 6 and p = 1/3 is unimodal with mode at 2.


NOTE — A discrete distribution (see 4.22) is unimodal if its
probability mass function has exactly one mode, bimodal if its
probability mass function has exactly two modes and
multi-modal if its probability mass function has more than two modes.
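
The mode of a probability mass function can be located by direct enumeration. The following Python sketch is illustrative only; the helper names `binomial_pmf` and `modes` are assumptions introduced here and are not defined in this standard. It evaluates the binomial probability mass function for n = 6 and p = 1/3 with exact fractions and lists the values at which a local maximum is attained.

```python
from fractions import Fraction
from math import comb

def binomial_pmf(x, n, p):
    """P(X = x) for the binomial distribution with parameters n and p."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

def modes(pmf_values):
    """Return the indices at which the probability mass function has a local maximum."""
    found = []
    for i, value in enumerate(pmf_values):
        left = pmf_values[i - 1] if i > 0 else 0
        right = pmf_values[i + 1] if i < len(pmf_values) - 1 else 0
        if value >= left and value >= right:
            found.append(i)
    return found

n, p = 6, Fraction(1, 3)
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]
print(modes(pmf))
```

A single index is returned for this distribution, so it is unimodal in the sense of the NOTE above.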


4.26 Probability Density Function f(x) 


Used for a continuous distribution (see 4.23): non-negative
function which, when integrated over a given range, gives the
probability that a random variable lies in that range and
which, when integrated from $-\infty$ to x, gives the distribution
function (see 4.7) evaluated at x.


Examples: 


1) Various probability density functions are 
given in defining the common probability 
distributions encountered in practice. 
Subsequent examples include the normal 
(see 4.50), standardized normal (see 4.51), t 
(see 4.53), F (see 4.55), gamma (see 4.56), 
chi-squared (see 4.57), exponential (see 4.58), 
beta (see 4.59), uniform (see 4.60), 
multivariate normal (see 4.64) and bivariate 
normal distributions (see 4.65). 


2) For the distribution function defined by
$F(x) = 3x^2 - 2x^3$ where $0 \le x \le 1$, the
corresponding probability density function is
$f(x) = 6x(1 - x)$ where $0 \le x \le 1$.

3) Continuing with the battery example of 4.1, 
there does not exist a probability density 
function associated with the specified 
distribution function, owing to the positive 
probability of a zero outcome. However, the 
conditional distribution given that the battery 
is initially functioning has f(x) = exp(-x) for x 
> 0 as its probability density function, which 
corresponds to the exponential distribution. 


NOTES 


1 If the distribution function, F is continuously 
differentiable, then the probability density function is 
f(x) = dF(x)/dx at the points x where the derivative exists. 


2 A graphical plot of f(x) versus x suggests descriptions 
such as symmetric, peaked, heavy-tailed, unimodal,
bimodal and so forth. A plot of a fitted f(x) overlaid on a
histogram provides a visual assessment of the agreement 
between a fitted distribution and the data. 


3 A common abbreviation of probability density 
function is pdf. 
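
The relation f(x) = dF(x)/dx in Note 1 can be illustrated with Example 2. The following Python sketch is a numerical check only, assuming the distribution function $F(x) = 3x^2 - 2x^3$ on [0, 1]; the central-difference helper and the step size h are choices made here for illustration.

```python
def F(x):
    """Distribution function of Example 2, valid for 0 <= x <= 1."""
    return 3 * x**2 - 2 * x**3

def f(x):
    """Probability density function of Example 2."""
    return 6 * x * (1 - x)

def central_difference(func, x, h=1e-5):
    """Numerical approximation of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

points = [0.1, 0.25, 0.5, 0.75, 0.9]
max_error = max(abs(central_difference(F, x) - f(x)) for x in points)
print(max_error)
```

The largest discrepancy at the chosen points is of the order of the discretization error, which supports reading f as the derivative of F.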


4.27 Mode of Probability Density Function 


Value(s) where a probability density function (see 4.26) 
attains a local maximum. 


NOTES 


1 A continuous distribution (see 4.23) is unimodal if its 
probability density function has exactly one mode, bimodal if 
its probability density function has exactly two modes and 
multi-modal if its probability density function has more than 
two modes. 


2 A distribution where the modes constitute a connected set is 
also said to be unimodal. 


4.28 Discrete Random Variable 


Random Variable (see 4.10) having a discrete 
distribution (see 4.22). 


NOTE — Discrete random variables considered in this standard 
include the binomial (see 4.46), Poisson (see 4.47), hypergeometric 
(see 4.48) and multinomial (see 4.45) random variables. 


4.29 Continuous Random Variable 


Random Variable (see 4.10) having a continuous 
distribution (see 4.23). 


NOTE — Continuous random variables considered in this 
standard include the normal (see 4.50), standardized normal 
(see 4.51), t distribution (see 4.53), F distribution (see 4.55), 
gamma (see 4.56), chi-squared (see 4.57), exponential 
(see 4.58), beta (see 4.59), uniform (see 4.60), Type I extreme 
value (see 4.61), Type II extreme value (see 4.62), Type III 
extreme value (see 4.63), lognormal (see 4.52), multivariate 
normal (see 4.64) and bivariate normal (see 4.65). 


4.30 Centred Probability Distribution 


Probability Distribution (see 4.11) of a centred random 
variable (see 4.31). 


4.31 Centred Random Variable 


Random variable (see 4.10) with its mean (see 4.35) 
subtracted. 


NOTES 
1 A centred random variable has mean equal to zero.


2 This term only applies to random variables with a mean. For 
example, the mean of the t distribution (see 4.53) with one 
degree of freedom does not exist. 


3 If a random variable X has a mean (see 4.35) equal to $\mu$, the
corresponding centred random variable is $X - \mu$, having mean
equal to zero.


4.32 Standardized Probability Distribution 


Probability Distribution (see 4.11) of a standardized 
random variable (see 4.33). 


4.33 Standardized Random Variable 


Centred Random Variable (see 4.31) whose standard 
deviation (see 4.37) is equal to 1. 


NOTES 

1 A random variable (see 4.10) is automatically standardized
if its mean is zero and its standard deviation is 1. The uniform
distribution (see 4.60) on the interval $(-3^{0.5}, 3^{0.5})$ has mean
zero and standard deviation equal to 1. The standardized normal
distribution (see 4.51) is, of course, standardized.

2 If the distribution (see 4.11) of the random variable X has
mean (see 4.35) $\mu$ and standard deviation $\sigma$, then the
corresponding standardized random variable is $(X - \mu)/\sigma$.


4.34 Moment of Order r, rth Moment


Expectation (see 4.12) of the rth power of a random
variable (see 4.10).


Example: 


Consider a random variable having probability density
function (see 4.26) $f(x) = \exp(-x)$ for x > 0. Using
integration by parts from elementary calculus, it can
be shown that $E(X) = 1$, $E(X^2) = 2$, $E(X^3) = 6$, and
$E(X^4) = 24$, or in general, $E(X^r) = r!$. This is an example
of the exponential distribution (see 4.58).
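
The moments quoted in this example can be confirmed by numerical integration. The following Python sketch is illustrative: it approximates $E(X^r)$ for the density exp(-x) with the composite trapezoidal rule, and the truncation point and number of steps are assumptions chosen here so that the neglected tail is negligible for r up to 4.

```python
import math

def moment(r, upper=60.0, steps=100_000):
    """Approximate E(X^r) for f(x) = exp(-x), x > 0, by the trapezoidal rule.

    The integral is truncated at `upper`; the remaining tail is negligible
    for the small orders r used below.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        x = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x**r * math.exp(-x)
    return total * h

moments = {r: moment(r) for r in range(1, 5)}
for r, value in moments.items():
    print(r, value, math.factorial(r))
```

Each approximated moment agrees with r! to within the accuracy of the integration rule.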


NOTES 


1 In the univariate discrete case, the appropriate formula is:

$$E(X^r) = \sum_{i=1}^{n} x_i^r \, p(x_i)$$

for a finite number n of outcomes and

$$E(X^r) = \sum_{i=1}^{\infty} x_i^r \, p(x_i)$$

for a countably infinite number of outcomes. In the univariate
continuous case, the appropriate formula is:

$$E(X^r) = \int_{-\infty}^{\infty} x^r f(x)\,dx$$


2 If the random variable has dimension k, then the rth power is
understood to be applied componentwise.


3 The moments given here use a random variable X raised to a
power. More generally, one could consider moments of order r
of $X - \mu$ or $(X - \mu)/\sigma$.


4.35 Means 
4.35.1 Mean — Moment of Order r = 1, $\mu$


«Continuous distribution» moment of order r where r 
equals 1, calculated as the integral of the product of x 
and the probability density function (see 4.26), f(x), 
over the real line. 


Examples: 


1) Consider a continuous random variable
(see 4.29) X having probability density
function $f(x) = 6x(1 - x)$, where $0 \le x \le 1$. The
mean of X is:

$$\int_0^1 6x^2(1 - x)\,dx = 0.5$$


2) Continuing with the battery example from 4.1 
and 4.7, the mean is 0.9 since with probability 
0.1 the mean of the discrete part of the 
distribution is 0 and with probability 0.9 the 
mean of the continuous part of the distribution 
is 1. This distribution is a mixture of 
continuous and discrete distributions. 


NOTES 


1 The mean of a continuous distribution (see 4.23) is denoted
by E(X) and is computed as:

$$E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$$

2 The mean does not exist for all random variables (see 4.10).
For example, if X is defined by its probability density function
$f(x) = [\pi(1 + x^2)]^{-1}$, the integral corresponding to E(X) is
divergent.




4.35.2 Mean $\mu$


<Discrete distribution> summation of the product of $x_i$
and the probability mass function (see 4.24) $p(x_i)$.


Examples: 


1) Consider a discrete random variable X 
(see 4.28) representing the number of heads 
resulting from the tossing of three fair coins. 
The probability mass function is 


P(X = 0) = 1/8
P(X = 1) = 3/8
P(X = 2) = 3/8
P(X = 3) = 1/8


Hence, the mean of X is
0(1/8) + 1(3/8) + 2(3/8) + 3(1/8) = 12/8 = 1.5

2) See Example 2 in 4.35.1.


NOTE — The mean of a discrete distribution (see 4.22)
is denoted by E(X) and is computed as:

$$E(X) = \sum_{i=1}^{n} x_i \, p(x_i)$$

for a finite number of outcomes, and

$$E(X) = \sum_{i=1}^{\infty} x_i \, p(x_i)$$

for a countably infinite number of outcomes.


4.36 Variance V 


Moment of order r (see 4.34) where r equals 2 in the 
centred probability distribution (see 4.30) of the 
random variable (see 4.10). 


Examples: 


1) For the discrete random variable (see 4.28)
in Example 1 of 4.24 the variance is

$$\sum_{i=0}^{3} (x_i - 1.5)^2 \, P(X = x_i) = 0.75$$

2) For the continuous random variable (see 4.29)
in Example 2 of 4.26, the variance is

$$\int_0^1 (x - 0.5)^2 \, 6x(1 - x)\,dx = 0.05$$

3) For the battery example of 4.1, the variance
can be determined by recognizing that the
variance of X is $E(X^2) - [E(X)]^2$. From
Example 2 of 4.35.1, E(X) = 0.9. Using the same
type of conditioning argument, $E(X^2)$ can be
shown to be 1.8. Thus, the variance of X is
$1.8 - (0.9)^2$ which equals 0.99.
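
Example 1 can be reproduced with exact arithmetic. The following Python sketch is illustrative (the dictionary `pmf` and the helper `moment` are names introduced here, not part of this standard). It computes the variance of the three-coin random variable both as the second moment of the centred variable and as $E(X^2) - [E(X)]^2$, the identity used in Example 3.

```python
from fractions import Fraction

# Probability mass function for the number of heads in three tosses of a fair coin
pmf = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(3, 8), 3: Fraction(1, 8)}

def moment(pmf, r):
    """Moment of order r: sum of x**r * P(X = x) over the sample space."""
    return sum(x**r * p for x, p in pmf.items())

mean = moment(pmf, 1)
variance_centred = sum((x - mean)**2 * p for x, p in pmf.items())
variance_shortcut = moment(pmf, 2) - mean**2
print(mean, variance_centred, variance_shortcut)
```

Both routes give the same variance, as expected from the equivalence of the two definitions.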


NOTE — The variance can equivalently be defined as
the expectation (see 4.12) of the square of the random
variable minus its mean (see 4.35). The variance of a
random variable X is denoted by $V(X) = E\{[X - E(X)]^2\}$.


4.37 Standard Deviation, $\sigma$

Positive square root of the variance (see 4.36). 


Example — For the battery example of 4.1 and 4.7, the 
standard deviation is 0.995. 


4.38 Coefficient of Variation, CV 


<Positive random variable> standard deviation 
(see 4.37) divided by the mean (see 4.35). 


Example — For the battery example of 4.1 and 4.7, the
coefficient of variation is 0.995/0.9 which equals
approximately 1.106.


NOTES 


1 The coefficient of variation is commonly reported as a 
percentage. 


2 The predecessor term “relative standard deviation” is 
deprecated by the term coefficient of variation. 


4.39 Coefficient of Skewness, $\gamma_1$


Moment of order 3 (see 4.34) in the standardized 
probability distribution (see 4.32) of a random variable 
(see 4.10). 


Example — Continuing with the battery example of
4.1 and 4.7 having a mixed continuous-discrete
distribution, one has, using results from the example in
4.34:

$$E(X) = 0.1(0) + 0.9(1) = 0.9$$
$$E(X^2) = 0.1(0^2) + 0.9(2) = 1.8$$
$$E(X^3) = 0.1(0) + 0.9(6) = 5.4$$
$$E(X^4) = 0.1(0) + 0.9(24) = 21.6$$

To compute the coefficient of skewness, note that
$E\{[X - E(X)]^3\} = E(X^3) - 3E(X)E(X^2) + 2[E(X)]^3$ and
from 4.37 the standard deviation is 0.995. The
coefficient of skewness is thus
$[5.4 - 3(0.9)(1.8) + 2(0.9)^3]/(0.995)^3$ or 2.028.


NOTES 

1 An equivalent definition is based on the expectation (see 4.12)
of the third power of $(X - \mu)/\sigma$, namely $E\{[(X - \mu)/\sigma]^3\}$.

2 The coefficient of skewness is a measure of the symmetry of
a distribution (see 4.11) and is sometimes denoted by $\sqrt{\beta_1}$. For
symmetric distributions, the coefficient of skewness is equal
to 0 (provided the appropriate moments in the definition exist).
Examples of distributions with skewness equal to zero include
the normal distribution (see 4.50), the beta distribution
(see 4.59) provided $\alpha = \beta$ and the t distribution (see 4.53)
provided the moments exist.


4.40 Coefficient of Kurtosis, $\beta_2$


Moment of order 4 (see 4.34) in the standardized 
probability distribution (see 4.32) of a random variable 
(see 4.10). 


Example — Continuing with the battery example of
4.1 and 4.7, to compute the coefficient of kurtosis, note
that

$$E\{[X - E(X)]^4\} = E(X^4) - 4E(X)E(X^3) + 6[E(X)]^2 E(X^2) - 3[E(X)]^4$$

The coefficient of kurtosis is thus

$[21.6 - 4(0.9)(5.4) + 6(0.9)^2(1.8) - 3(0.9)^4]/(0.995)^4$
or 9.12.


NOTES 


1 An equivalent definition is based on the expectation (see 4.12)
of the fourth power of $(X - \mu)/\sigma$, namely $E\{[(X - \mu)/\sigma]^4\}$.

2 The coefficient of kurtosis is a measure of the heaviness of 
the tails of a distribution (see 4.11). For the uniform distribution 
(see 4.60), the coefficient of kurtosis is 1.8; for the normal 
distribution (see 4.50), the coefficient of kurtosis is 3; for the 
exponential distribution (see 4.58), the coefficient of kurtosis 
is 9. 

3 Caution needs to be exercised in considering reported kurtosis 
values, as some practitioners subtract 3 (the kurtosis of the 
normal distribution) from the value that is computed from the 
definition. 
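
The summary measures of the battery example in 4.36 to 4.40 can be recomputed from its raw moments. The following Python sketch is illustrative and assumes the mixed distribution described in the examples: probability 0.1 of a zero outcome and, with probability 0.9, an exponential distribution with mean 1, so that $E(X^r) = 0.9\,r!$. Variable names are introduced here for illustration only.

```python
import math

# Raw moments E(X^r): the point mass at zero contributes nothing,
# the exponential part contributes 0.9 * r!
raw = {r: 0.9 * math.factorial(r) for r in range(1, 5)}

mean = raw[1]
variance = raw[2] - mean**2
sd = math.sqrt(variance)

# Central moments expressed through raw moments
third_central = raw[3] - 3 * mean * raw[2] + 2 * mean**3
fourth_central = raw[4] - 4 * mean * raw[3] + 6 * mean**2 * raw[2] - 3 * mean**4

cv = sd / mean                      # coefficient of variation (4.38)
skewness = third_central / sd**3    # coefficient of skewness (4.39)
kurtosis = fourth_central / sd**4   # coefficient of kurtosis (4.40)
print(mean, variance, sd, cv, skewness, kurtosis)
```

Dividing each central moment by the matching power of the standard deviation is what turns it into a moment of the standardized distribution.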


4.41 Joint Moment of Orders r and s 


Mean (see 4.35) of the product of the rth power of a
random variable (see 4.10) and the sth power of another
random variable in their joint probability distribution
(see 4.11).


4.42 Joint Central Moment of Orders r and s 


Mean (see 4.35) of the product of the rth power of a
centred random variable (see 4.31) and the sth power
of another centred random variable in their joint
probability distribution (see 4.11).


4.43 Covariance, $\sigma_{XY}$


Mean (see 4.35) of the product of two centred random 
variables (see 4.31) in their joint probability distribution 
(see 4.11). 


NOTES 


1 The covariance is the joint central moment of orders 1 and 1 
(see 4.42) for two random variables. 


2 Notationally, the covariance is

$$E[(X - \mu_X)(Y - \mu_Y)],$$

where $E(X) = \mu_X$ and $E(Y) = \mu_Y$.


4.44 Correlation Coefficient 


Mean (see 4.35) of the product of two standardized 
random variables (see 4.33) in their joint probability 
distribution (see 4.11). 


NOTE — Correlation coefficient is sometimes more briefly 
referred to as simply correlation. However, this usage overlaps 
with interpretations of correlation as an association between 
two variables. 


4.45 Multinomial Distribution 


Discrete distribution (see 4.22) having the probability
mass function (see 4.24)

$$P(X_1 = x_1, X_2 = x_2, \ldots, X_k = x_k) = \frac{n!}{x_1!\, x_2! \cdots x_k!}\; p_1^{x_1} p_2^{x_2} \cdots p_k^{x_k}$$

where

$x_1, x_2, \ldots, x_k$ are non-negative integers such that $x_1 + x_2 + \cdots + x_k = n$

with parameters $p_i > 0$ for all $i = 1, 2, \ldots, k$ with $p_1 + p_2 + \cdots + p_k = 1$

k an integer greater than or equal to 2


NOTE — The multinomial distribution gives the probability
of the number of times each of k possible outcomes have
occurred in n independent trials where each trial has the same
k mutually exclusive events and the probabilities of the events
are the same for all trials.


4.46 Binomial Distribution 


Discrete distribution (see 4.22) having the probability
mass function (see 4.24)

$$P(X = x) = \frac{n!}{x!\,(n - x)!}\; p^x (1 - p)^{n - x}$$

where x = 0, 1, 2, ..., n and with indexing parameters
n = 1, 2, ... and 0 < p < 1.


Example — The probability mass function described
in Example 1 of 4.24 can be seen to correspond to the
binomial distribution with index parameters n = 3 and
p = 0.5.


NOTES 
1 The binomial distribution is a special case of the multinomial 
distribution (see 4.45) with k = 2. 


2 The binomial distribution gives the probability of the number 
of times each of two possible outcomes have occurred in n 
independent trials where each trial has the same two mutually 
exclusive events (see 4.2) and the probabilities (see 4.5) of the 
events are the same for all trials. 


3 The mean (see 4.35) of the binomial distribution equals np.
The variance (see 4.36) of the binomial distribution equals
np(1 − p).


4 The binomial probability mass function may be alternately
expressed using the binomial coefficient given by

$$\binom{n}{x} = \frac{n!}{x!\,(n - x)!}$$
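
Note 1 can be illustrated by evaluating both probability mass functions. The following Python sketch is illustrative only; `multinomial_pmf` and `binomial_pmf` are helper names assumed here, and n is taken as the sum of the counts. It shows that the multinomial distribution with k = 2 reproduces the binomial probabilities for n = 3 and p = 0.5, the case in the Example above.

```python
from fractions import Fraction
from math import comb, factorial, prod

def multinomial_pmf(counts, probs):
    """P(X_1 = x_1, ..., X_k = x_k), with n taken as the sum of the counts."""
    coefficient = factorial(sum(counts))
    for x in counts:
        coefficient //= factorial(x)  # exact: each intermediate quotient is an integer
    return coefficient * prod(p**x for p, x in zip(probs, counts))

def binomial_pmf(x, n, p):
    """P(X = x) written with the binomial coefficient of Note 4."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 3, Fraction(1, 2)
binomial = [binomial_pmf(x, n, p) for x in range(n + 1)]
multinomial = [multinomial_pmf((x, n - x), (p, 1 - p)) for x in range(n + 1)]
print(binomial)
print(multinomial)
```

Exact fractions are used so that the two lists can be compared for equality rather than approximate agreement.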


4.47 Poisson Distribution 


Discrete distribution (see 4.22) having the probability
mass function (see 4.24)

$$P(X = x) = \frac{\lambda^x}{x!}\; e^{-\lambda}$$

where x = 0, 1, 2, ... and with parameter $\lambda > 0$.


NOTES

1 The limit of the binomial distribution (see 4.46) as n
approaches $\infty$ and p tends to zero in such a way that np tends to
$\lambda$ is the Poisson distribution with parameter $\lambda$.

2 The mean (see 4.35) and the variance (see 4.36) of the Poisson
distribution are both equal to $\lambda$.

3 The probability mass function (see 4.24) of the Poisson
distribution gives the probability for the number of occurrences
of a property of a process in a time interval of unit length
satisfying certain conditions, for example intensity of
occurrence independent of time.
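
The limiting behaviour in Note 1 can be observed numerically. The following Python sketch is illustrative: the parameter value 2 and the sequence of n values are assumptions chosen here. It holds np fixed and records the largest difference between the binomial and Poisson probabilities over x = 0, 1, ..., 10.

```python
import math

def binomial_pmf(x, n, p):
    """P(X = x) for the binomial distribution with parameters n and p."""
    return math.comb(n, x) * p**x * (1 - p)**(n - x)

def poisson_pmf(x, lam):
    """P(X = x) for the Poisson distribution with parameter lam."""
    return lam**x * math.exp(-lam) / math.factorial(x)

lam = 2.0
gaps = []
for n in (10, 100, 10_000):
    p = lam / n  # keeps n * p equal to lam as n grows
    gap = max(abs(binomial_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(11))
    gaps.append(gap)
    print(n, gap)
```

The recorded differences shrink as n increases, consistent with the Poisson distribution arising as the limit.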


4.48 Hypergeometric Distribution 


Discrete distribution (see 4.22) having the probability
mass function (see 4.24)

$$P(X = x) = \frac{\dfrac{M!}{x!\,(M - x)!}\;\dfrac{(N - M)!}{(n - x)!\,(N - M - n + x)!}}{\dfrac{N!}{n!\,(N - n)!}}$$

where $\max(0, n + M - N) \le x \le \min(M, n)$ with
integer parameters

N = 1, 2, ...
M = 0, 1, 2, ..., N − 1; n = 1, 2, ..., N

NOTES 

1 The hypergeometric distribution (see 4.11) arises as the 
number of marked items in a simple random sample (see 3.1.7) 
of size n, taken without replacement, from a population (or lot)
of size N containing exactly M marked items. 


2 An understanding of the hypergeometric distribution may be 
facilitated with Table 4. 


Table 4 Hypergeometric Distribution Example
(Note 2)

Sl No.  Reference Set         Marked or Unmarked Items  Marked Items  Unmarked Items
i)      Population            N                         M             N − M
ii)     Items in sample       n                         x             n − x
iii)    Items not in sample   N − n                     M − x         N − n − M + x


3 Under certain conditions (for example, n is small relative to 
N), then the hypergeometric distribution can be approximated 
by the binomial distribution with n and p = M/N. 

4 The mean (see 4.35) of the hypergeometric distribution equals
(nM)/N. The variance (see 4.36) of the hypergeometric
distribution equals

$$n\,\frac{M}{N}\left(1 - \frac{M}{N}\right)\frac{N - n}{N - 1}$$


4.49 Negative Binomial Distribution


Discrete distribution (see 4.22) having the probability
mass function (see 4.24)

$$P(X = x) = \frac{(c + x - 1)!}{x!\,(c - 1)!}\; p^c (1 - p)^x$$

where x = 0, 1, 2, ... with parameter c > 0 and
parameter p satisfying 0 < p < 1.


NOTES 


1 If c = 1, the negative binomial distribution is known as the 
geometric distribution and describes the probability (see 4.5) 
that the first incident of the event (see 4.2) whose probability 
is p, will occur in trial (x + 1). 

2 The probability mass function may also be written in the
following, equivalent way:

$$P(X = x) = \binom{-c}{x}\, p^c\, [-(1 - p)]^x$$

The term “negative binomial distribution” emerges from this
way of writing the probability mass function.

3 The version of the probability mass function given in the
definition is often called the “Pascal distribution” provided c
is an integer greater than or equal to 1. In that case, the
probability mass function describes the probability that the cth
incident of the event (see 4.2), whose probability (see 4.5) is
p, occurs in trial (c + x).

4 With p denoting the probability of the event, as in Notes 1
and 3, the mean (see 4.35) of the negative binomial distribution is
$c(1 - p)/p$. The variance (see 4.36) of the negative binomial
distribution is $c(1 - p)/p^2$.
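
The mean and variance of the negative binomial distribution can be checked by summing over the sample space. The following Python sketch is illustrative and rests on stated assumptions: p is the probability of the event and x counts the trials without the event before its cth occurrence, as in Notes 1 and 3; the parameter values and the truncation of the infinite sum at 400 terms are choices made here.

```python
from math import comb

def negative_binomial_pmf(x, c, p):
    """P(X = x): x trials without the event before the c-th occurrence (integer c)."""
    return comb(c + x - 1, x) * p**c * (1 - p)**x

c, p = 3, 0.4
support = range(400)  # truncation of the countably infinite sample space

total = sum(negative_binomial_pmf(x, c, p) for x in support)
mean = sum(x * negative_binomial_pmf(x, c, p) for x in support)
variance = sum((x - mean)**2 * negative_binomial_pmf(x, c, p) for x in support)
print(total, mean, variance)
```

Under this reading of p, the sums can be compared with c(1 − p)/p for the mean and c(1 − p)/p² for the variance.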


4.50 Normal Distribution; Gaussian Distribution 


Continuous distribution (see 4.23) having the
probability density function (see 4.26)

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\; e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$

where $-\infty < x < \infty$ and with parameters $-\infty < \mu < \infty$
and $\sigma > 0$.


NOTES 


1 The normal distribution is one of the most widely used 
probability distributions (see 4.11) in applied statistics. Owing 
to the shape of the density function, it is informally referred to 
as the "bell-shaped" curve. Aside from serving as a model for 
random phenomena, it arises as the limiting distribution of 
averages (see 3.1.15). As a reference distribution in statistics, 
it is widely used to assess the unusualness of experimental 
outcomes. 


2 The location parameter $\mu$ is the mean (see 4.35) and the scale
parameter $\sigma$ is the standard deviation (see 4.37) of the normal
distribution.


4.51 Standardized Normal Distribution;
Standardized Gaussian Distribution

Normal distribution (see 4.50) with $\mu = 0$ and $\sigma = 1$.


NOTE — The probability density function (see 4.26) of the
standardized normal distribution is

$$f(x) = \frac{1}{\sqrt{2\pi}}\; e^{-x^2/2}$$

where $-\infty < x < \infty$. Tables of the normal distribution involve
this probability density function, giving for example, the area
under f for values in $(-\infty, \infty)$.


4.52 Lognormal Distribution 


Continuous distribution (see 4.23) having the
probability density function (see 4.26)

$$f(x) = \frac{1}{x\sigma\sqrt{2\pi}}\; e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}$$

where x > 0 and with parameters $-\infty < \mu < \infty$ and $\sigma > 0$.


NOTES 


1 If Y has a normal distribution (see 4.50) with mean (see 4.35)
$\mu$ and standard deviation (see 4.37) $\sigma$, then the transformation
given by X = exp(Y) has the probability density function given
in the definition. If X has a lognormal distribution with density
function as given in the definition, then ln(X) has a normal
distribution with mean $\mu$ and standard deviation $\sigma$.


2 The mean of the lognormal distribution is $\exp[\mu + \sigma^2/2]$
and the variance is $\exp(2\mu + \sigma^2) \times [\exp(\sigma^2) - 1]$. This indicates
that the mean and variance of the lognormal distribution are
functions of the parameters $\mu$ and $\sigma^2$.


3 The lognormal distribution and Weibull distribution (see 4.63) 
are commonly used in reliability applications. 


4.53 t Distribution; Student's Distribution

Continuous distribution (see 4.23) having the
probability density function (see 4.26)

$$f(t) = \frac{\Gamma[(\nu + 1)/2]}{\sqrt{\pi\nu}\;\Gamma(\nu/2)} \left(1 + \frac{t^2}{\nu}\right)^{-(\nu + 1)/2}$$

where $-\infty < t < \infty$ and with parameter $\nu$, a positive
integer.


NOTES 


1 The t distribution is widely used in practice to evaluate the
sample mean (see 3.1.15) in the common case where the
population standard deviation is estimated from the data. The
sample t statistic can be compared to the t distribution with
n − 1 degrees of freedom to assess a specified mean as a
depiction of the true population mean.

2 The t distribution arises as the distribution of the quotient of 
two independent random variables (see 4.10), the numerator 
of which has a standardized normal distribution (see 4.51) and 
the denominator is distributed as the positive square root of a 
chi-squared distribution (see 4.57) after dividing by its degrees 
of freedom. The parameter is referred to as the degrees of 
freedom (see 4.54). 


3 The gamma function is defined in 4.56. 


4.54 Degrees of Freedom $\nu$


Number of terms in a sum minus the number of 
constraints on the terms of the sum. 


NOTE — This concept was previously encountered in the
context of using n − 1 in the denominator of the estimator
(see 3.1.12) of the sample variance (see 3.1.16). The number
of degrees of freedom is used to modify parameters. The term
degrees of freedom is also widely used in ISO 3534-3 where
mean squares are given as sums of squares divided by the
appropriate degrees of freedom.


4.55 F Distribution 


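Continuous distribution (see 4.23) having the
probability density function (see 4.26)

$$f(x) = \frac{\Gamma[(\nu_1 + \nu_2)/2]}{\Gamma(\nu_1/2)\,\Gamma(\nu_2/2)}\; \nu_1^{\nu_1/2}\, \nu_2^{\nu_2/2}\; \frac{x^{(\nu_1/2) - 1}}{(\nu_1 x + \nu_2)^{(\nu_1 + \nu_2)/2}}$$

where

x > 0

$\nu_1$ and $\nu_2$ are positive integers

$\Gamma$ is the gamma function defined in 4.56.
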
NOTES 


1 The F distribution is a useful reference distribution for 
assessing the ratio of independent variances (see 4.36). 


2 The F distribution arises as the distribution of the quotient 
of two independent random variables each having a chi-squared 
distribution (see 4.57), divided by its degrees of freedom 
(see 4.54). The parameter $\nu_1$ is the numerator degrees of
freedom and $\nu_2$ is the denominator degrees of freedom of the
F distribution.


4.56 Gamma Distribution 


Continuous distribution (see 4.23) having the
probability density function (see 4.26)

$$f(x) = \frac{x^{\alpha - 1}\, e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}$$

where x > 0 and parameters $\alpha > 0$, $\beta > 0$.


NOTES 


1 The gamma distribution is used in reliability applications 
for modelling time to failure. It includes the exponential 
distribution (see 4.58) as a special case as well as other cases 
with failure rates that increase with age. 


2 The gamma function is defined by

$$\Gamma(\alpha) = \int_0^{\infty} x^{\alpha - 1} e^{-x}\,dx$$

For integer values of $\alpha$, $\Gamma(\alpha) = (\alpha - 1)!$

3 The mean (see 4.35) of the gamma distribution is $\alpha\beta$. The
variance (see 4.36) of the gamma distribution is $\alpha\beta^2$.


4.57 Chi-squared Distribution/$\chi^2$ Distribution

Continuous distribution (see 4.23) having the
probability density function (see 4.26)

$$f(x) = \frac{x^{(\nu/2) - 1}\, e^{-x/2}}{2^{\nu/2}\,\Gamma(\nu/2)}$$

where x > 0 and with $\nu > 0$.


NOTES


1 For data arising from a normal distribution (see 4.50) with
known standard deviation (see 4.37) $\sigma$, the statistic $(n - 1)S^2/\sigma^2$ has
a chi-squared distribution with (n − 1) degrees of freedom. This
result is the basis for obtaining confidence intervals for $\sigma^2$.
Another area of application for the chi-squared distribution is 
as the reference distribution for goodness of fit tests. 


2 This distribution is a special case of the gamma distribution
(see 4.56) with parameters $\alpha = \nu/2$ and $\beta = 2$. The parameter $\nu$ is
referred to as the degrees of freedom (see 4.54).


3 The mean (see 4.35) of the chi-squared distribution is v. The 
variance (see 4.36) of the chi-squared distribution is 2v. 
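
The relationship in Note 2 can be verified pointwise. The following Python sketch is illustrative and assumes the gamma density of 4.56 in the form $x^{\alpha-1} e^{-x/\beta} / [\beta^{\alpha}\Gamma(\alpha)]$; the function names and the evaluation points are chosen here for illustration.

```python
import math

def gamma_pdf(x, alpha, beta):
    """Gamma probability density function with shape alpha and scale beta, for x > 0."""
    return x**(alpha - 1) * math.exp(-x / beta) / (beta**alpha * math.gamma(alpha))

def chi_squared_pdf(x, nu):
    """Chi-squared probability density function with nu degrees of freedom, for x > 0."""
    return x**(nu / 2 - 1) * math.exp(-x / 2) / (2**(nu / 2) * math.gamma(nu / 2))

max_gap = max(
    abs(gamma_pdf(x, nu / 2, 2.0) - chi_squared_pdf(x, nu))
    for nu in (1, 2, 5)
    for x in (0.5, 1.0, 3.0)
)
print(max_gap)
```

The same helper also covers the exponential distribution of 4.58, which corresponds to `gamma_pdf(x, 1.0, beta)`.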


4.58 Exponential Distribution 


Continuous distribution (see 4.23) having the
probability density function (see 4.26)

$$f(x) = \beta^{-1} e^{-x/\beta}$$

where x > 0 and with parameter $\beta > 0$.


NOTES 


1 The exponential distribution provides a baseline in reliability 
applications, corresponding to the case of “lack of aging” or 
memory-less property. 

2 The exponential distribution is a special case of the gamma
distribution (see 4.56) with $\alpha = 1$ or equivalently, the
chi-squared distribution (see 4.57) with $\nu = 2$.


3 The mean (see 4.35) of the exponential distribution is $\beta$. The
variance (see 4.36) of the exponential distribution is $\beta^2$.


4.59 Beta Distribution 


Continuous distribution (see 4.23) having the
probability density function (see 4.26)

$$f(x) = \frac{\Gamma(\alpha + \beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\; x^{\alpha - 1} (1 - x)^{\beta - 1}$$

where $0 \le x \le 1$ and with parameters $\alpha, \beta > 0$.


NOTE — The beta distribution is highly flexible, having a 
probability density function that has a variety of shapes 
(unimodal, “j”-shaped, “u”-shaped). The distribution can be
used as a model of the uncertainty associated with a proportion. 
For example, in an insurance hurricane modelling application, 
the expected proportion of damage on a type of structure for a 
given wind speed might be 0.40, although not all houses 
experiencing this wind field will accrue the same damage. A 
beta distribution with mean 0.40 could serve to model the 
disparity in damage to this type of structure. 


4.60 Uniform Distribution/Rectangular Distribution 


Continuous distribution (see 4.23) having the
probability density function (see 4.26)

$$f(x) = \frac{1}{b - a}$$

where $a \le x \le b$.


NOTES 


1 The uniform distribution with a = 0 and b = 1 is the underlying 
distribution for typical random number generators. 
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2 The mean (see 4.35) of the uniform distribution is (a+b)/2. 
The variance (see 4.36) of the uniform distribution is (b — a)/12. 


3 The uniform distribution with a = 0 and b = 1 is a special case
of the beta distribution (see 4.59) with α = 1 and β = 1.
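
As a rough check of the mean (a + b)/2 and the variance (b − a)²/12, the following Python sketch draws from a standard uniform generator and rescales the draws to an interval. The interval end points, the seed and the sample size are arbitrary choices for this illustration.

```python
import random
import statistics

a, b = 2.0, 5.0              # illustrative interval end points
rng = random.Random(12345)   # fixed seed so that the run is repeatable

# Rescale standard uniform draws on [0, 1) to the interval [a, b).
sample = [a + (b - a) * rng.random() for _ in range(100_000)]

sample_mean = statistics.fmean(sample)
sample_variance = statistics.pvariance(sample)

print(sample_mean, (a + b) / 2)             # the two values should be close
print(sample_variance, (b - a) ** 2 / 12)   # the two values should be close
```

Agreement is only approximate because the sample is finite; a larger sample narrows the gap.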


4.61 Type I Extreme Value Distribution/Gumbel 
Distribution 


Continuous distribution (see 4.23) having the
distribution function (see 4.7):

$F(x) = e^{-e^{-(x - a)/b}}$

where −∞ < x < ∞ and with parameters −∞ < a < ∞, b > 0.


NOTE — Extreme value distributions provide appropriate 
reference distributions for the extreme order statistics (see 3.1.9)
X(1) and X(n). The three possible limiting distributions as n tends
to ∞ are provided by the three types of extreme value
distributions given in 4.61, 4.62 and 4.63.
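
One well-known instance of this limiting behaviour can be checked directly. For the maximum X(n) of n independent exponential random variables with β = 1, P(X(n) − ln n ≤ x) = (1 − e^(−x)/n)^n, which approaches the Type I distribution function with a = 0 and b = 1 as n grows. The Python sketch below assumes the form F(x) = exp(−exp(−(x − a)/b)); the function names and the evaluation point are choices for this illustration.

```python
import math


def gumbel_cdf(x, a=0.0, b=1.0):
    """Type I extreme value distribution function exp(-exp(-(x - a) / b))."""
    if b <= 0:
        raise ValueError("b must be positive")
    return math.exp(-math.exp(-(x - a) / b))


def shifted_maximum_cdf(x, n):
    """P(X(n) - ln(n) <= x) for the maximum of n independent exponential(1) variables."""
    base = 1.0 - math.exp(-x) / n
    return base ** n if base > 0 else 0.0


x = 0.5
for n in (10, 1_000, 1_000_000):
    # The gap between the exact value and the Gumbel limit shrinks as n grows.
    print(n, shifted_maximum_cdf(x, n), gumbel_cdf(x))
```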


4.62 Type II Extreme Value Distribution/Fréchet 
Distribution 


Continuous distribution (see 4.23) having the
distribution function (see 4.7):

$F(x) = e^{-\left(\frac{x - a}{b}\right)^{-k}}$

where x > a and with parameters −∞ < a < ∞,
b > 0, k > 0.


4.63 Type III Extreme Value Distribution/Weibull 
Distribution 


Continuous distribution (see 4.23) having the distribution
function (see 4.7):

$F(x) = 1 - e^{-\left(\frac{x - a}{b}\right)^{k}}$

where x > a and with parameters −∞ < a < ∞, b > 0, k > 0.


NOTES 


1 In addition to serving as one of the three possible limiting 
distributions of extreme order statistics, the Weibull
distribution occupies a prominent place in diverse 
applications, particularly reliability and engineering. The 
Weibull distribution has been demonstrated to provide 
empirical fits to a variety of data sets. 

2 The parameter a is a location parameter in the sense that it is
the minimum value that the Weibull distribution can achieve.
The parameter b is a scale parameter [related to the standard
deviation (see 4.37) of the Weibull distribution]. The parameter
k is a shape parameter.


3 For k = 1, the Weibull distribution is seen to include the 
exponential distribution. Raising an exponential distribution 
with a = 0 and parameter b to the power 1/k produces the Weibull 
distribution in the definition. Another special case of the Weibull 
distribution is the Rayleigh distribution (for a = 0 and k = 2). 
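
The relationship between the Weibull and exponential distributions in Note 3 can be illustrated with the distribution functions alone. The Python sketch below is hedged: it assumes the forms F(x) = exp(−((x − a)/b)^(−k)) for Type II and F(x) = 1 − exp(−((x − a)/b)^k) for Type III, and an exponential distribution with mean β. Under these assumptions, raising an exponential variable with parameter β to the power 1/k gives a Weibull variable with a = 0 and scale β^(1/k).

```python
import math


def frechet_cdf(x, a, b, k):
    """Type II extreme value distribution function; 0 for x <= a."""
    if b <= 0 or k <= 0:
        raise ValueError("b and k must be positive")
    return math.exp(-((x - a) / b) ** (-k)) if x > a else 0.0


def weibull_cdf(x, a, b, k):
    """Type III extreme value distribution function; 0 for x <= a."""
    if b <= 0 or k <= 0:
        raise ValueError("b and k must be positive")
    return 1.0 - math.exp(-((x - a) / b) ** k) if x > a else 0.0


def exponential_cdf(x, beta):
    """Distribution function of the exponential distribution with mean beta."""
    return 1.0 - math.exp(-x / beta) if x > 0 else 0.0


x, beta, k = 1.3, 2.0, 3.0

# k = 1 and a = 0: the Weibull distribution reduces to the exponential one.
print(weibull_cdf(x, 0.0, beta, 1.0), exponential_cdf(x, beta))

# If Y is exponential with mean beta, then P(Y ** (1 / k) <= x) = P(Y <= x ** k).
print(weibull_cdf(x, 0.0, beta ** (1.0 / k), k), exponential_cdf(x ** k, beta))
```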


4.64 Multivariate Normal Distribution 


Continuous distribution (see 4.23) having the
probability density function (see 4.26):

$f(\mathbf{x}) = (2\pi)^{-n/2}\, |\Sigma|^{-1/2} \exp\left[-\frac{(\mathbf{x} - \boldsymbol{\mu})^{T} \Sigma^{-1} (\mathbf{x} - \boldsymbol{\mu})}{2}\right]$

where

−∞ < x_i < ∞ for each i;

μ is an n-dimensional parameter vector;

Σ is an n × n symmetric, positive definite matrix of
parameters; and

the boldface indicates n-dimensional vectors.

NOTE — Each of the marginal distributions (see 4.18) of the 
multivariate distribution in this clause has a normal distribution. 
However, there are many other multivariate distributions having
normal marginal distributions besides the version of the
distribution given in this clause.
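
A minimal Python sketch of the density follows. It assumes the form f(x) = (2π)^(−n/2) |Σ|^(−1/2) exp[−(x − μ)ᵀ Σ⁻¹ (x − μ)/2]. The use of a Cholesky factorization is a choice made for this illustration, and all names and numerical values are hypothetical. When Σ is diagonal, the result should equal the product of the univariate normal densities of the components.

```python
import math


def cholesky_lower(sigma):
    """Return lower-triangular L with L L^T = sigma; fail if sigma is not positive definite."""
    n = len(sigma)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            partial = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                pivot = sigma[i][i] - partial
                if pivot <= 0:
                    raise ValueError("sigma must be positive definite")
                lower[i][j] = math.sqrt(pivot)
            else:
                lower[i][j] = (sigma[i][j] - partial) / lower[j][j]
    return lower


def multivariate_normal_pdf(x, mu, sigma):
    """Density of the n-dimensional normal distribution at the point x."""
    n = len(mu)
    lower = cholesky_lower(sigma)
    # Solve L z = x - mu by forward substitution; z.z is the quadratic form.
    z = []
    for i in range(n):
        residual = (x[i] - mu[i]) - sum(lower[i][k] * z[k] for k in range(i))
        z.append(residual / lower[i][i])
    quadratic_form = sum(value * value for value in z)
    sqrt_det = math.prod(lower[i][i] for i in range(n))  # square root of det(sigma)
    return math.exp(-0.5 * quadratic_form) / ((2 * math.pi) ** (n / 2) * sqrt_det)


def normal_pdf(x, mu, sd):
    """Univariate normal density with mean mu and standard deviation sd."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


mu = [1.0, -2.0, 0.5]
sigma = [[4.0, 0.0, 0.0], [0.0, 0.25, 0.0], [0.0, 0.0, 1.0]]
point = [0.0, -1.5, 1.0]

joint = multivariate_normal_pdf(point, mu, sigma)
product = math.prod(normal_pdf(point[i], mu[i], math.sqrt(sigma[i][i])) for i in range(3))
print(joint, product)  # equal up to rounding for a diagonal sigma
```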


4.65 Bivariate Normal Distribution 


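Continuous distribution (see 4.23) having the
probability density function (see 4.26):

$f(x, y) = \frac{1}{2\pi \sigma_x \sigma_y \sqrt{1 - \rho^2}} \exp\left\{ -\frac{1}{2(1 - \rho^2)} \left[ \left(\frac{x - \mu_x}{\sigma_x}\right)^2 - 2\rho \left(\frac{x - \mu_x}{\sigma_x}\right) \left(\frac{y - \mu_y}{\sigma_y}\right) + \left(\frac{y - \mu_y}{\sigma_y}\right)^2 \right] \right\}$

where

−∞ < x < ∞,
−∞ < y < ∞,
−∞ < μ_x < ∞,
−∞ < μ_y < ∞,
σ_x > 0,
σ_y > 0,
|ρ| < 1.
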
NOTES 


1 As the notation suggests, for (X, Y) having the above
probability density function (see 4.26), E(X) = μ_x, E(Y) = μ_y,
V(X) = σ_x², V(Y) = σ_y² and ρ is the correlation coefficient (see
4.44) between X and Y.


2 The marginal distributions of the bivariate normal distribution 
have a normal distribution. The conditional distribution of X 
given Y = y is normally distributed as is the conditional 
distribution of Y given X = x. 
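
The statement in Note 2 about the marginal distributions can be checked numerically by integrating the joint density over y. The Python sketch below assumes the usual bivariate normal density with means μ_x, μ_y, standard deviations σ_x, σ_y and correlation coefficient ρ; the parameter values and integration settings are arbitrary choices for this illustration.

```python
import math


def bivariate_normal_pdf(x, y, mu_x, mu_y, sigma_x, sigma_y, rho):
    """Bivariate normal density at (x, y)."""
    if sigma_x <= 0 or sigma_y <= 0 or abs(rho) >= 1:
        raise ValueError("need sigma_x > 0, sigma_y > 0 and |rho| < 1")
    u = (x - mu_x) / sigma_x
    v = (y - mu_y) / sigma_y
    one_minus_rho2 = 1.0 - rho * rho
    exponent = -(u * u - 2.0 * rho * u * v + v * v) / (2.0 * one_minus_rho2)
    return math.exp(exponent) / (2.0 * math.pi * sigma_x * sigma_y * math.sqrt(one_minus_rho2))


def normal_pdf(x, mu, sigma):
    """Univariate normal density with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def marginal_density_of_x(x, mu_x, mu_y, sigma_x, sigma_y, rho, steps=20_000):
    """Integrate the joint density over y by the midpoint rule."""
    lower = mu_y - 12.0 * sigma_y
    upper = mu_y + 12.0 * sigma_y
    width = (upper - lower) / steps
    return width * sum(
        bivariate_normal_pdf(x, lower + (i + 0.5) * width, mu_x, mu_y, sigma_x, sigma_y, rho)
        for i in range(steps)
    )


params = dict(mu_x=1.0, mu_y=-0.5, sigma_x=2.0, sigma_y=1.5, rho=0.6)
marginal = marginal_density_of_x(0.8, **params)
expected = normal_pdf(0.8, params["mu_x"], params["sigma_x"])
print(marginal, expected)  # the two values agree closely
```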


4.66 Standardized Bivariate Normal Distribution 


Bivariate normal distribution (see 4.65) having
standardized normal distribution (see 4.51) components. 


4.67 Sampling Distribution 
Distribution of a statistic. 


NOTE — Illustrations of specific sampling distributions are 
given in Note 2 of 4.53, Note 1 of 4.55 and Note 1 of 4.57. 


4.68 Probability Space (Ω, ℵ, ℘)


Sample space (see 4.1), an associated sigma algebra of 
events (see 4.69), and a probability measure (see 4.70). 


Examples: 


1) As a simple case, the sample space could 
consist of all the 105 items manufactured in a 
specified day at a plant. The sigma algebra of 
events consists of all possible subsets. Such 
events include {no items}, {item 1}, {item
2}, ..., {item 105}, {item 1 and item 2}, ...,
{all 105 items}. One possible probability 
measure could be defined as the number of 
items in an event divided by the total number 
of manufactured items. For example, the event 
{item 4, item 27, item 92} has probability 
measure 3/105. 

2) As a second example, consider battery 
lifetimes. If the batteries arrive in the hands 
of the customer and they have no power, the 
survival time is 0 h. If the batteries are
functional, then their survival times follow 
some probability distribution (see 4.11), such 
as an exponential (see 4.58). The collection 
of survival times is then governed by a 
distribution that is a mixture between discrete 
(the proportion of batteries that are not 
functional to begin with) and continuous (an 
actual survival time). For simplicity in this 
example, it is assumed that the lifetimes of 
the batteries are relatively short compared to 
the study time and that all survival times are 
measured on the continuum. Of course, in 
practice the possibility of right or left censored 
survival times (for example, the failure time 
is known to be at least 5 h or the failure time 
is between 3 h and 3.5 h) could occur, in which
case, further advantages of this structure would 
emerge. The sample space consists of half of 
the real line (real numbers greater than or equal 
to zero). The sigma algebra of events includes 
all intervals of the form (0, x) and the set {0}. 
Additionally, the sigma algebra includes all 
countable unions and intersections of these 
sets. The probability measure involves 
determining for each set, its constituents that 
represent non-functional batteries and those 
having a positive survival time. Details on the 
computations associated with the failure times 
have been given throughout this clause where 
appropriate. 
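
The first example can be written out as a short Python sketch. It is illustrative only: the 105 items are represented by the integers 1 to 105, any subset is treated as an event, and the names used are hypothetical.

```python
from fractions import Fraction

SAMPLE_SPACE = frozenset(range(1, 106))   # items 1 to 105


def probability_measure(event):
    """Number of items in the event divided by the total number of items."""
    event = frozenset(event)
    if not event <= SAMPLE_SPACE:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(SAMPLE_SPACE))


print(probability_measure({4, 27, 92}))    # 1/35, that is 3/105
print(probability_measure(set()))          # 0
print(probability_measure(SAMPLE_SPACE))   # 1
```

Exact fractions are used so that the additivity of the measure over disjoint events holds without rounding error.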


4.69 Sigma Algebra of Events/σ-Algebra/Sigma
Field/σ-Field/ℵ
Set of events (see 4.2) with the properties: 


a) Ω belongs to ℵ;

b) If an event belongs to ℵ, then its
complementary event (see 4.3) also belongs
to ℵ; and

c) If {A_i} is any set of events in ℵ, then the union
$\bigcup_{i=1}^{\infty} A_i$ and the intersection $\bigcap_{i=1}^{\infty} A_i$ of the events
belong to ℵ.

Examples: 


1) If the sample space is the set of integers, then
a sigma algebra of events may be chosen to
be the set of all subsets of the integers.

2) If the sample space is the set of real numbers,
then a sigma algebra of events may be chosen
to include all sets corresponding to intervals
on the real line and all finite and
countable unions and intersections of these
intervals. This example can be extended to
higher dimensions by considering
n-dimensional “intervals”. In particular, in two
dimensions, the set of intervals could consist
of regions defined by {(x, y): x < s, y < t} for
all real values of s and t.


NOTES 


1 A sigma algebra is a set consisting of sets as its members.
The set of all possible outcomes Ω is a member of the sigma
algebra of events, as indicated in property (a). 


2 Property (c) involves set operations on a collection of subsets 
(possibly countably infinite) of the sigma algebra of events. 
The notation given indicates that all countable unions and 
intersections of these sets also belong to the sigma algebra of 
events. 


3 Property (c) includes closure (the sets belong to the sigma 
algebra of events) under either finite unions or intersections. 
The qualifier sigma is used to stress that ℵ is closed even under
countably infinite operations on sets. 
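
For a finite sample space, properties (a) to (c) can be verified by direct enumeration. The Python sketch below is an illustration under that finiteness assumption, in which closure under pairwise unions and intersections is sufficient for property (c); the function names are hypothetical.

```python
from itertools import combinations


def power_set(sample_space):
    """Return the set of all subsets of a finite sample space."""
    items = list(sample_space)
    return {frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)}


def is_sigma_algebra(collection, sample_space):
    """Check properties (a) to (c) for a finite collection of events."""
    omega = frozenset(sample_space)
    events = {frozenset(e) for e in collection}
    if omega not in events:                               # property (a)
        return False
    if any((omega - e) not in events for e in events):    # property (b)
        return False
    # Property (c): pairwise closure is enough for a finite collection.
    return all((e | f) in events and (e & f) in events for e in events for f in events)


omega = {1, 2, 3}
print(is_sigma_algebra(power_set(omega), omega))             # True
print(is_sigma_algebra([set(), {1}, {2, 3}, omega], omega))  # True
print(is_sigma_algebra([set(), {1}, omega], omega))          # False: complement of {1} is missing
```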


4.70 Probability Measure/℘


Non-negative function defined on the sigma algebra
of events (see 4.69) such that 


a) ℘(Ω) = 1, where Ω denotes the sample space
(see 4.1); and

b) $\wp\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} \wp(A_i)$, where {A_i} is a
sequence of pair-wise disjoint events (see 4.2).

Example: 


Continuing the battery life example of 4.1, consider 
the event that the battery survives less than one hour. 
This event consists of the disjoint pair of events {does 
not function} and {functions less than one hour but 
functions initially}. Equivalently, the events can be 
denoted {0} and (0,1). The probability measure of {0} 
is the proportion of batteries that do not function upon 
the initial attempt. The probability measure of the set 
(0, 1) depends on the specific continuous probability 
distribution [for example, exponential (see 4.58)] 
governing the failure distribution.

NOTES


1 A probability measure assigns a value from [0, 1] to each
event in the sigma algebra of events. The value 0 corresponds 
to an event being impossible, while the value 1 represents 
certainty of occurrence. In particular, the probability measure 
associated with the null set is zero and the probability measure 
assigned to the sample space is 1. 

2 Property (b) indicates that if a sequence of events has no
elements in common when considered in pairs, then the
probability measure of the union is the sum of the individual
probability measures. As further indicated in property (b), this
holds if the number of events is countably infinite.

3 The three components of the probability space are effectively
linked via random variables. The probabilities (see 4.5) of the events
in the image set of the random variable (see 4.10) derive from
the probabilities of events in the sample space. An event in the
image set of the random variable is assigned the probability of
the event in the sample space that is mapped onto it by the
random variable.

4 The image set of the random variable is the set of real
numbers or the set of ordered n-tuples of real numbers. (Note
that the image set is the set onto which the random variable
maps.)


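The battery example can be sketched in Python as a mixture of a point mass at 0 and an exponential distribution. This is an illustration only: the proportion of non-functional batteries (0.02) and the exponential parameter (5 h) are hypothetical values, and the exponential survival function exp(−x/β) is assumed.

```python
import math

P_DEAD_ON_ARRIVAL = 0.02   # hypothetical proportion of non-functional batteries
BETA_HOURS = 5.0           # hypothetical exponential parameter for functional batteries


def measure_of_zero():
    """Probability measure of the event {0}: the battery never functions."""
    return P_DEAD_ON_ARRIVAL


def measure_of_open_interval(lower, upper):
    """Probability measure of the interval (lower, upper) with 0 <= lower < upper."""
    if not 0 <= lower < upper:
        raise ValueError("need 0 <= lower < upper")
    continuous_part = math.exp(-lower / BETA_HOURS) - math.exp(-upper / BETA_HOURS)
    return (1.0 - P_DEAD_ON_ARRIVAL) * continuous_part


# {0} and (0, 1) are disjoint, so property (b) gives the measure of their union.
less_than_one_hour = measure_of_zero() + measure_of_open_interval(0.0, 1.0)

# Property (a): the measure of the whole sample space is 1.
whole_space = measure_of_zero() + measure_of_open_interval(0.0, math.inf)

print(less_than_one_hour, whole_space)
```
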
ANNEX A 
(Foreword) 
SYMBOLS 
Symbol(s)                        English Term                                    Clause No.

A                                event                                           4.2
A^c                              complementary event                             4.3
ℵ                                sigma algebra of events, σ-algebra,
                                 sigma field, σ-field                            4.69
α                                significance level                              3.1.45
α, λ, μ, β, σ, ρ, γ, p, N, M,
c, ν, a, b, k                    parameter
β₂                               coefficient of kurtosis                         4.40
E(X^k)                           sample moment of order k                        3.1.14
E[g(X)]                          expectation of the function g of a
                                 random variable X                               4.12
F(x)                             distribution function                           4.7
f(x)                             probability density function                    4.26
γ₁                               coefficient of skewness                         4.39
H                                hypothesis                                      3.1.40
H₀                               null hypothesis                                 3.1.41
H_A, H₁                          alternative hypothesis                          3.1.42
k                                dimension
k, r, s                          order of a moment                               3.1.14, 4.34, 4.42
μ                                mean                                            4.35
ν                                degrees of freedom                              4.54
n                                sample size
Ω                                sample space                                    4.1
(Ω, ℵ, ℘)                        probability space                               4.68
P(A)                             probability of an event A                       4.5
P(A|B)                           conditional probability of A given B            4.6
℘                                probability measure                             4.70
r_xy                             sample correlation coefficient                  3.1.23
s                                observed value of a sample standard deviation
S                                sample standard deviation                       3.1.17
S²                               sample variance                                 3.1.16
S_XY                             sample covariance                               3.1.22
σ                                standard deviation                              4.37
σ²                               variance                                        4.36
σ_XY                             covariance                                      4.43
$\sigma_{\hat{\theta}}$          standard error                                  3.1.24
$\sigma_{\bar{X}}$               standard error of the sample mean
θ                                parameter of a distribution
$\hat{\theta}$                   estimator                                       3.1.12
V(X)                             variance of a random variable X                 4.36
X(i)                             i-th order statistic                            3.1.9
x, y, z                          observed value                                  3.1.4
X, Y, Z, T                       random variable                                 4.10
X_p, x_p                         p-quantile, p-fractile                          4.13
$\bar{X}, \bar{x}$               average, sample mean                            3.1.15


ANNEX B
(Foreword) 


STATISTICAL CONCEPT DIAGRAMS


[Concept diagram. Terms shown: population (3.1.1); sample (3.1.3); sampling unit (3.1.2); distribution function (4.7); observed value (3.1.4); random sample (3.1.6); random variable (4.10); statistic (3.1.8); test statistic (3.1.52); simple random sample (3.1.7); order statistic (3.1.9); descriptive statistics (3.1.5); estimator (3.1.12); sample median (3.1.13); (extreme order statistic); sample range (3.1.10); mid-range (3.1.11).]

FIG. 1 BASIC POPULATION AND SAMPLE CONCEPTS


[Concept diagram. Terms shown: simple random sample (3.1.7); sample moment of order k (3.1.14); sample mean (3.1.15); sample coefficient of variation (3.1.18); sample variance (3.1.16); sample coefficient of skewness (3.1.20); sample coefficient of kurtosis (3.1.21); sample correlation coefficient (3.1.23); sample standard deviation (3.1.17); standardized sample random variable (3.1.19); sample covariance (3.1.22).]

FIG. 2 CONCEPTS REGARDING SAMPLE MOMENTS


[Concept diagram. Terms shown: standard error (3.1.24); estimator (3.1.12); estimation (3.1.36); interval estimator (3.1.25); estimate (3.1.31); error of estimation (3.1.32); maximum likelihood estimator (3.1.35); maximum likelihood estimation (3.1.37); prediction interval (3.1.30); bias (3.1.33); unbiased estimator (3.1.34); parameter (4.9); likelihood function (3.1.38); profile likelihood function (3.1.39); confidence interval (3.1.28); statistical tolerance interval (3.1.26); probability density function (4.26); family of distributions (4.8); probability mass function (4.24); one-sided confidence interval (3.1.29); statistical tolerance limit (3.1.27).]

FIG. 3 ESTIMATION CONCEPTS


[Concept diagram. Terms shown: test statistic (3.1.52); statistical test (3.1.48); hypothesis (3.1.40); p-value (3.1.49); null hypothesis (3.1.41); alternative hypothesis (3.1.42); simple hypothesis (3.1.43); composite hypothesis (3.1.44); Type I error (3.1.46); Type II error (3.1.47); significance level (3.1.45); power of a test (3.1.50); power curve (3.1.51); family of distributions (4.8).]

FIG. 4 CONCEPTS REGARDING STATISTICAL TESTS


[Concept diagram. Terms shown: descriptive statistics (3.1.5); graphical descriptive statistics (3.1.53); numerical descriptive statistics (3.1.54); observed value (3.1.4); class (3.1.55); frequency (3.1.59); class limits (3.1.56); mid-point of class (3.1.57); class width (3.1.58); frequency distribution (3.1.60); relative frequency (3.1.64); cumulative frequency (3.1.63); (representation of a frequency distribution); cumulative relative frequency (3.1.65); histogram (3.1.61); bar chart (3.1.62).]

FIG. 5 CONCEPTS REGARDING CLASSES AND EMPIRICAL DISTRIBUTIONS


[Concept diagram. Terms shown: (finite population); (statistical model); (infinite population); population (3.1.1); (hypothetical population); sample (3.1.3); random variable (4.10); parameter (4.9); observed value (3.1.4); (inferential statistics); estimation (3.1.36); (prediction); statistical test (3.1.48).]

FIG. 6 STATISTICAL INFERENCE CONCEPT DIAGRAM


ANNEX C
(Foreword) 


PROBABILITY CONCEPT DIAGRAMS


[Concept diagram. Terms shown: probability space (Ω, ℵ, ℘) (4.68); expectation (4.12); sample space, Ω (4.1); sigma algebra of events, ℵ (4.69); probability measure, ℘ (4.70); complementary event (4.3); event (4.2); probability (4.5); conditional probability of A given B (4.6); family of distributions (4.8); parameter (4.9); independent events (4.4); distribution function (4.7); probability distribution (4.11); random variable (4.10); p-quantile (4.13); median (4.14); quartile (4.15).]

FIG. 7 FUNDAMENTAL CONCEPTS IN PROBABILITY


[Concept diagram. Terms shown: probability distribution (4.11); random variable (4.10); expectation (4.12); joint moment of orders r and s (4.41); discrete random variable (4.28); continuous random variable (4.29); centred probability distribution (4.30); centred random variable (4.31); moment of order r (4.34); joint central moment of orders r and s (4.42); mean (4.35); covariance (4.43); standardized probability distribution (4.32); standardized random variable (4.33); coefficient of variation (4.38); variance (4.36); coefficient of skewness (4.39); coefficient of kurtosis (4.40); standard error (3.1.24); standard deviation (4.37); correlation coefficient (4.44).]

FIG. 8 CONCEPTS REGARDING MOMENTS


[Concept diagram. Terms shown: probability distribution (4.11); univariate probability distribution (4.16); multivariate probability distribution (4.17); probability mass function (4.24); discrete probability distribution (4.22); continuous probability distribution (4.23); probability density function (4.26); mode of probability mass function (4.25); mode of probability density function (4.27); marginal probability distribution (4.18); conditional probability distribution (4.19); multinomial distribution (4.45); Poisson distribution (4.47); hypergeometric distribution (4.48); negative binomial distribution (4.49); regression curve (4.20); regression surface (4.21); binomial distribution (4.46).]

FIG. 9 CONCEPTS REGARDING PROBABILITY DISTRIBUTIONS


[Concept diagram. Terms shown: continuous probability distribution (4.23); lognormal distribution (4.52); normal distribution (4.50); t distribution (4.53); F distribution (4.55); degrees of freedom (4.54); gamma distribution (4.56); beta distribution (4.59); (extreme value distribution); multivariate normal distribution (4.64); standardized normal distribution (4.51); bivariate normal distribution (4.65); chi-squared distribution (4.57); exponential distribution (4.58); uniform distribution (4.60); standardized bivariate normal distribution (4.66); Type I extreme value distribution, Gumbel (4.61); Type II extreme value distribution, Fréchet (4.62); Type III extreme value distribution, Weibull (4.63).]

FIG. 10 CONCEPTS REGARDING CONTINUOUS DISTRIBUTIONS


ANNEX D
(Foreword) 


METHODOLOGY USED IN THE DEVELOPMENT OF THE VOCABULARY 


D-1 INTRODUCTION 


The concepts used in this standard are interrelated, and 
an analysis of these relationships among concepts 
within the field of applied statistics and their 
arrangement into concept diagrams is a prerequisite 
of a coherent and harmonized vocabulary that is easily 
understandable by potential users of applied statistics 
standards. Since the concept diagrams employed during 
the development process may be helpful in an 
informative sense, they are reproduced in D-4. 


D-2 CONTENT OF A VOCABULARY ENTRY 
AND THE SUBSTITUTION RULE 


The concept forms the unit of transfer between
languages. For each language, the term chosen is the
one that conveys the concept most transparently in
that language, which is not necessarily a literal
translation.


A definition is formed by describing only those 
characteristics that are essential to identify the concept. 
Information concerning the concept which is important 
but which is not essential to its description is put in 
one or more notes to the definition. 


When a term is substituted by its definition, subject to 
minor syntax changes, there should be no change in 
the meaning of the text. Such a substitution provides a 
simple method for checking the accuracy of a definition. 
However, where the definition is complex in the sense 
that it contains a number of terms, substitution is best 
carried out taking one or, at most, two definitions at a 
time. Complete substitution of the totality of the terms 
will become difficult to achieve syntactically and will 
be unhelpful in conveying meaning. 


D-3 CONCEPT RELATIONSHIPS AND THEIR 
GRAPHICAL REPRESENTATION 


D-3.1 General 


In terminology work, the relationships between 
concepts are based on the hierarchical formation of 
the characteristics of a species so that the most 
economical description of a concept is formed by 
naming its species and describing the characteristics that
distinguish it from its parent or sibling concepts. 


There are three primary forms of concept relationships 
indicated in this annex: generic (see D-3.2), partitive 
(see D-3.3) and associative (see D-3.4).


D-3.2 Generic Relation


Subordinate concepts within the hierarchy inherit all 
the characteristics of the superordinate concept and 
contain descriptions of these characteristics which 
distinguish them from the superordinate (parent) and 
coordinate (sibling) concepts, for example the relation 
of spring, summer, autumn and winter to season. 


Generic relations are depicted by a fan or tree diagram 
without arrows (see Fig. 11). 


season

spring    summer    autumn    winter


FIG. 11 GRAPHICAL REPRESENTATION OF A GENERIC RELATION


D-3.3 Partitive Relation


Subordinate concepts within the hierarchy form
constituent parts of the superordinate concept, for 
example spring, summer, autumn and winter may be 
defined as parts of the concept year. In comparison, it 
is inappropriate to define sunny weather (one possible 
characteristic of summer) as part of a year. 


Partitive relations are depicted by a rake, without arrows 
(see Fig. 12). Singular parts are depicted by one line, 
multiple parts by double lines. 


year

spring    summer    autumn    winter


FIG. 12 GRAPHICAL REPRESENTATION OF A PARTITIVE RELATION


D-3.4 Associative Relation 


Associative relations cannot provide the economies in 
description that are present in generic and partitive 
relations but are helpful in identifying the nature of 
the relationship between one concept and another 
within a concept system, for example cause and effect, 
activity and location, activity and result, tool and 
function, material and product. 


Associative relations are depicted by a line with 
arrowheads at each end (see Fig. 13).


sunshine <-----> summer


FIG. 13 GRAPHICAL REPRESENTATION OF AN ASSOCIATIVE RELATION


D-4 CONCEPT DIAGRAMS 


Figures 1 to 5 show the concept diagrams on which the
definitions given in 3 of this standard are based. Figure
6 is an additional concept diagram that indicates the
relationship of certain terms appearing previously in
Fig. 1 to Fig. 5. Figures 7 to 10 show the concept
diagrams on which the definitions given in 4 of this
standard are based. Several terms appear in multiple
concept diagrams, thus providing a linkage among the
diagrams. These are indicated as follows:


Fig. 1 Basic Population and Sample Concepts

descriptive statistics (see 3.1.5)            Fig. 5
simple random sample (see 3.1.7)              Fig. 2
estimator (see 3.1.12)                        Fig. 3
test statistic (see 3.1.52)                   Fig. 4
random variable (see 4.10)                    Fig. 7, 8
distribution function (see 4.7)               Fig. 7

Fig. 2 Concepts Regarding Sample Moments

simple random sample (see 3.1.7)              Fig. 1

Fig. 3 Estimation Concepts

estimator (see 3.1.12)                        Fig. 1
parameter (see 4.9)                           Fig. 7
family of distributions (see 4.8)             Fig. 4, 7
probability density function (see 4.26)       Fig. 9
probability mass function (see 4.24)          Fig. 9

Fig. 4 Concepts Regarding Statistical Tests

test statistic (see 3.1.52)                   Fig. 1
probability density function (see 4.26)       Fig. 3, 9
probability mass function (see 4.24)          Fig. 3, 9
family of distributions (see 4.8)             Fig. 3, 7

Fig. 5 Concepts Regarding Classes and Empirical Distributions

descriptive statistics (see 3.1.5)            Fig. 1

Fig. 6 Statistical Inference Concept Diagram

population (see 3.1.1)                        Fig. 1
sample (see 3.1.3)                            Fig. 1
observed value (see 3.1.4)                    Fig. 1, 5
estimation (see 3.1.36)                       Fig. 3
statistical test (see 3.1.48)                 Fig. 4
parameter (see 4.9)                           Fig. 3, 7
random variable (see 4.10)                    Fig. 1, 7, 8

Fig. 7 Fundamental Concepts in Probability

random variable (see 4.10)                    Fig. 1, 8
probability distribution (see 4.11)           Fig. 8, 9
family of distributions (see 4.8)             Fig. 3, 4
distribution function (see 4.7)               Fig. 1
parameter (see 4.9)                           Fig. 3

Fig. 8 Concepts Regarding Moments

random variable (see 4.10)                    Fig. 1, 7
probability distribution (see 4.11)           Fig. 7, 9

Fig. 9 Concepts Regarding Probability Distributions

probability distribution (see 4.11)           Fig. 7, 8
probability mass function (see 4.24)          Fig. 3, 4
continuous distribution (see 4.23)            Fig. 10
univariate distribution (see 4.16)            Fig. 10
multivariate distribution (see 4.17)          Fig. 10

Fig. 10 Concepts Regarding Continuous Distributions

univariate distribution (see 4.16)            Fig. 9
multivariate distribution (see 4.17)          Fig. 9
continuous distribution (see 4.23)            Fig. 9


As a final note on Fig. 10, the following distributions 
are examples of univariate distributions: 


normal, t distribution, F distribution, standardized
normal, gamma, beta, chi-squared, exponential, 
uniform, Type I extreme value, Type II extreme value 
and Type III extreme value. The following distributions 
are examples of multivariate distributions: multivariate 
normal, bivariate normal and standardized bivariate 
normal. To include univariate distribution (see 4.16) 
and multivariate distribution (see 4.17) in the concept 
diagram would unduly clutter the figure.


ALPHABETICAL INDEX


χ² distribution 4.57
σ-algebra 4.69
σ-field 4.69


A 


alternative hypothesis 3.1.42 
arithmetic mean 3.1.15 
average 3.1.15


B


bar chart 3.1.62

beta distribution 4.59 

bias 3.1.33 

binomial distribution 4.46 
bivariate normal distribution 4.65 


C 


centred probability distribution 4.30 
centred random variable 4.31 
chi-squared distribution 4.57 

class 3.1.55.1, 3.1.55.2, 3.1.55.3 
class boundaries 3.1.56 

class limits 3.1.56 

class width 3.1.58 

classes 3.1.55 

coefficient of kurtosis 4.40 
coefficient of skewness 4.39 
coefficient of variation 4.38 
complementary event 4.3 

composite hypothesis 3.1.44 
conditional distribution 4.19 
conditional probability 4.6 
conditional probability distribution 4.19 
confidence interval 3.1.28 
continuous distribution 4.23 
continuous probability distribution 4.23 
continuous random variable 4.29 
correlation coefficient 4.44 
covariance 4.43 

cumulative frequency 3.1.63 
cumulative relative frequency 3.1.65 


D 


degrees of freedom 4.54 

descriptive statistics 3.1.5 

discrete distribution 4.22 

discrete probability distribution 4.22 
discrete random variable 4.28 
distribution 4.11 


distribution function of a random variable X 4.7 


E 


error of estimation 3.1.32 
estimate 3.1.31 

estimation 3.1.36 
estimator 3.1.12 

event 4.2 

expectation 4.12 
exponential distribution 4.58 


F 


F distribution 4.55 

family of distributions 4.8 
Fréchet distribution 4.62 
frequency 3.1.59 

frequency distribution 3.1.60 


G 


gamma distribution 4.56 

Gaussian distribution 4.50 

graphical descriptive statistics 3.1.53 
Gumbel distribution 4.61 


H 


histogram 3.1.61 
hypergeometric distribution 4.48 
hypothesis 3.1.40


I


independent events 4.4
interval estimator 3.1.25 


J 


joint central moment of orders r and s 4.42 
joint moment of orders r and s 4.41


L 


likelihood function 3.1.38 
lognormal distribution 4.52 


M 


marginal distribution 4.18 

marginal probability distribution 4.18 
maximum likelihood estimation 3.1.37 
maximum likelihood estimator 3.1.35 
mean 3.1.15, 4.35.1, 4.35.2 

median 4.14 

mid-point of class 3.1.57 

mid-range 3.1.11 

mode of probability density function 4.27 
mode of probability mass function 4.25
moment of order r 4.34

moment of order r = 1 4.35.1
multinomial distribution 4.45 
multivariate distribution 4.17 
multivariate normal distribution 4.64 
multivariate probability distribution 4.17 


N 


negative binomial distribution 4.49 
normal distribution 4.50 

null hypothesis 3.1.41 

numerical descriptive statistics 3.1.54 


O


observed value 3.1.4 
one-sided confidence interval 3.1.29 
order statistic 3.1.9 


P 


parameter 4.9 

p-fractile 4.13 

Poisson distribution 4.47 
population 3.1.1 

power curve 3.1.51 

power of a test 3.1.50 

p-quantile 4.13 

prediction interval 3.1.30 
probability density function 4.26 
probability distribution 4.11 
probability mass function 4.24 
probability measure 4.70 
probability of an event A 4.5
probability space 4.68 

profile likelihood function 3.1.39 
p-value 3.1.49 


Q 
quartile 4.15


R


random sample 3.1.6
random variable 4.10 
rectangular distribution 4.60 
regression curve 4.20 
regression surface 4.21 
relative frequency 3.1.64 
rth moment 4.34


S


sample 3.1.3
sample coefficient of kurtosis 3.1.21 
sample coefficient of skewness 3.1.20 


sample coefficient of variation 3.1.18 
sample correlation coefficient 3.1.23 
sample covariance 3.1.22 

sample mean 3.1.15 

sample median 3.1.13 

sample moment of order k 3.1.14
sample range 3.1.10 

sample space 4.1 

sample standard deviation 3.1.17 
sample variance 3.1.16 

sampling distribution 4.67 
sampling unit 3.1.2 

sigma algebra of events 4.69 

sigma field 4.69 

significance level 3.1.45 
significance test 3.1.48 

simple hypothesis 3.1.43 

simple random sample 3.1.7 
standard deviation 4.37 

standard error 3.1.24 


standardized bivariate normal distribution 4.66 


standardized Gaussian distribution 4.51 
standardized normal distribution 4.51 
standardized probability distribution 4.32 
standardized random variable 4.33 


standardized sample random variable 3.1.19 


statistic 3.1.8 

statistical test 3.1.48 

statistical tolerance interval 3.1.26 
statistical tolerance limit 3.1.27 
Student’s distribution 4.53 


T 


t distribution 4.53 

test statistic 3.1.52 

Type I error 3.1.46

Type I extreme value distribution 4.61

Type II error 3.1.47

Type II extreme value distribution 4.62 

Type III extreme value distribution 4.63


U


unbiased estimator 3.1.34 

uniform distribution 4.60 

univariate distribution 4.16 

univariate probability distribution 4.16 


V 
variance 4.36 

W 
Weibull distribution 4.63